Abstract



A systematic search of the research literature from 1996 through July 2008 identified more than 
a thousand empirical studies of online learning. Analysts screened these studies to find those that 
(a) contrasted an online to a face-to-face condition, (b) measured student learning outcomes, (c) 
used a rigorous research design, and (d) provided adequate information to calculate an effect 
size. As a result of this screening, 51 independent effects were identified that could be subjected
to meta-analysis. The meta-analysis found that, on average, students in online learning
conditions performed better than those receiving face-to-face instruction. The difference 
between student outcomes for online and face-to-face classes — measured as the difference 
between treatment and control means, divided by the pooled standard deviation — was larger in 
those studies contrasting conditions that blended elements of online and face-to-face instruction 
with conditions taught entirely face-to-face. Analysts noted that these blended conditions often 
included additional learning time and instructional elements not received by students in control 
conditions. This finding suggests that the positive effects associated with blended learning 
should not be attributed to the media, per se. An unexpected finding was the small number of 
rigorous published studies contrasting online and face-to-face learning conditions for K-12 
students. In light of this small corpus, caution is required in generalizing to the K-12 population 
because the results are derived for the most part from studies in other settings (e.g., medical 
training, higher education). 




Executive Summary 



Online learning — for students and for teachers — is one of the fastest growing trends in 
educational uses of technology. The National Center for Education Statistics (2008) estimated 
that the number of K-12 public school students enrolling in a technology-based distance
education course grew by 65 percent in the two years from 2002-03 to 2004-05. On the basis of a
more recent district survey, Picciano and Seaman (2009) estimated that more than a million
K-12 students took online courses in school year 2007-08.

Online learning overlaps with the broader category of distance learning, which encompasses 
earlier technologies such as correspondence courses, educational television and 
videoconferencing. Earlier studies of distance learning concluded that these technologies were 
not significantly different from regular classroom learning in terms of effectiveness. Policy- 
makers reasoned that if online instruction is no worse than traditional instruction in terms of 
student outcomes, then online education initiatives could be justified on the basis of cost 
efficiency or need to provide access to learners in settings where face-to-face instruction is not 
feasible. The question of the relative efficacy of online and face-to-face instruction needs to be 
revisited, however, in light of today’s online learning applications, which can take advantage of a 
wide range of Web resources, including not only multimedia but also Web-based applications 
and new collaboration technologies. These forms of online learning are a far cry from the 
televised broadcasts and videoconferencing that characterized earlier generations of distance 
education. Moreover, interest in hybrid approaches that blend in-class and online activities is 
increasing. Policy-makers and practitioners want to know about the effectiveness of Internet- 
based, interactive online learning approaches and need information about the conditions under 
which online learning is effective. 

The findings presented here are derived from (a) a systematic search for empirical studies of the 
effectiveness of online learning and (b) a meta-analysis of those studies from which effect sizes 
that contrasted online and face-to-face instruction could be extracted or estimated. A narrative 
summary of studies comparing different forms of online learning is also provided. 

These activities were undertaken to address four research questions: 

1. How does the effectiveness of online learning compare with that of face-to-face
instruction?

2. Does supplementing face-to-face instruction with online instruction enhance learning? 

3. What practices are associated with more effective online learning? 

4. What conditions influence the effectiveness of online learning? 

This meta-analysis and review of empirical online learning research are part of a broader study 
of practices in online learning being conducted by SRI International for the Policy and Program 
Studies Service of the U.S. Department of Education. The goal of the study as a whole is to 
provide policy-makers, administrators and educators with research-based guidance about how to
implement online learning for K-12 education and teacher preparation. An unexpected finding of 
the literature search, however, was the small number of published studies contrasting online and 
face-to-face learning conditions for K-12 students. Because the search encompassed the research
literature not only on K-12 education but also on career technology, medical and higher 
education, as well as corporate and military training, it yielded enough studies with older learners 
to justify a quantitative meta-analysis. Thus, analytic findings with implications for K-12 
learning are reported here, but caution is required in generalizing to the K-12 population because 
the results are derived for the most part from studies in other settings (e.g., medical training, 
higher education). 

This literature review and meta-analysis differ from recent meta-analyses of distance learning in 
that they 

• Limit the search to studies of Web-based instruction (i.e., eliminating studies of video- 
and audio-based telecourses or stand-alone, computer-based instruction); 

• Include only studies with random-assignment or controlled quasi-experimental designs; 
and 

• Examine effects only for objective measures of student learning (e.g., discarding effects 
for student or teacher perceptions of learning or course quality, student affect, etc.). 

This analysis and review distinguish between instruction that is offered entirely online and 
instruction that combines online and face-to-face elements. The first of the alternatives to 
classroom-based instruction, entirely online instruction, is attractive on the basis of cost and 
convenience as long as it is as effective as classroom instruction. The second alternative, which 
the online learning field generally refers to as blended or hybrid learning, needs to be more 
effective than conventional face-to-face instruction to justify the additional time and costs it 
entails. Because the evaluation criteria for the two types of learning differ, this meta-analysis 
presents separate estimates of mean effect size for the two subsets of studies. 

Literature Search 

The most unexpected finding was that an extensive initial search of the published literature from 
1996 through 2006 found no experimental or controlled quasi-experimental studies that both 
compared the learning effectiveness of online and face-to-face instruction for K-12 students and 
provided sufficient data for inclusion in a meta-analysis. A subsequent search extended the time
frame for studies through July 2008. 

The computerized searches of online databases and citations in prior meta-analyses of distance 
learning as well as a manual search of the last three years of key journals returned 1,132 
abstracts. In two stages of screening of the abstracts and full texts of the articles, 176 online 
learning research studies published between 1996 and 2008 were identified that used an 
experimental or quasi-experimental design and objectively measured student learning outcomes. 
Of these 176 studies, 99 had at least one contrast between an included online or blended learning 
condition and face-to-face (offline) instruction that potentially could be used in the quantitative 
meta-analysis. Just nine of these 99 involved K-12 learners. The 77 studies without a face-to- 
face condition compared different variations of online learning (without a face-to-face control 
condition) and were set aside for narrative synthesis. 

Meta-Analysis



Meta-analysis is a technique for combining the results of multiple experiments or quasi- 
experiments to obtain a composite estimate of the size of the effect. The result of each 
experiment is expressed as an effect size, which is the difference between the mean for the
treatment group and the mean for the control group, divided by the pooled standard deviation. Of
the 99 studies comparing online and face-to-face conditions, 46 provided sufficient data to
compute or estimate 51 independent effect sizes (some studies included more than one effect).
Four of the nine studies involving K-12 learners were excluded from the meta-analysis: Two
were quasi-experiments without statistical control for preexisting group differences; the other 
two failed to provide sufficient information to support computation of an effect size. 
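
The effect size definition above can be sketched in a few lines of Python. This is a minimal illustration of the standardized mean difference as described in the text (treatment mean minus control mean, divided by the pooled standard deviation). The function names and the sample scores are hypothetical, and any small-sample correction the analysts may have applied is not shown because this section does not describe one.

```python
from math import sqrt


def pooled_sd(sd_t, n_t, sd_c, n_c):
    """Pooled standard deviation of two independent groups."""
    return sqrt(((n_t - 1) * sd_t**2 + (n_c - 1) * sd_c**2) / (n_t + n_c - 2))


def effect_size(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Standardized mean difference: (treatment mean - control mean) / pooled SD."""
    return (mean_t - mean_c) / pooled_sd(sd_t, n_t, sd_c, n_c)


# Hypothetical test scores, not taken from any study in the review.
d = effect_size(mean_t=78.0, sd_t=10.0, n_t=30, mean_c=75.0, sd_c=10.0, n_c=30)
print(round(d, 2))  # a positive value favors the treatment (online) condition
```

With these made-up inputs the two groups share a standard deviation of 10, so a 3-point difference in means corresponds to an effect size of 0.3.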

Most of the articles containing the 51 effects in the meta-analysis were published in 2004 or
more recently. The split between studies of purely online learning and those contrasting blended
online/face-to-face conditions against face-to-face instruction was fairly even, with 28 effects in
the first category and 23 in the second. The 51 estimated effect sizes included seven contrasts
from five studies conducted with K-12 learners — two from eighth-grade students in social 
studies classes, one for eighth- and ninth-grade students taking Algebra I, two from a study of 
middle school students taking Spanish, one for fifth-grade students in science classes in Taiwan, 
and one from elementary-age students in special education classes. The types of learners in the 
remaining studies were about evenly split between college or community college students and 
graduate students or adults receiving professional training. All but two of the studies involved 
formal instruction. The most common subject matter was medicine or health care. Other content 
types were computer science, teacher education, mathematics, languages, science, social science, 
and business. Among the 49 contrasts from studies that indicated the time period over which 
instruction occurred, 19 involved instructional time frames of less than a month, and the 
remainder involved longer periods. In terms of instructional features, the online learning 
conditions in these studies were less likely to be instructor-directed (8 contrasts) than they were 
to be student-directed, independent learning (17 contrasts) or interactive and collaborative in 
nature (23 contrasts). 

Effect sizes were computed or estimated for this final set of 51 contrasts. Among the 51
individual study effects, 11 were significantly positive, favoring the online or blended learning
condition. Two contrasts found a statistically significant effect favoring the traditional face-to-
face condition.¹



¹ When an α < .05 level of significance is used for contrasts, one would expect approximately 1 in 20 contrasts to
show a significant difference by chance. For 51 contrasts, then, one would expect 2 or 3 significant differences by
chance. The finding of 2 significant contrasts associated with face-to-face instruction is clearly within the range
one would expect by chance; the 11 contrasts associated with online or hybrid instruction exceed what one would
expect by chance.
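
The chance-expectation reasoning in the note above can be checked with a short calculation. The sketch below assumes, purely for illustration, that the 51 contrasts are independent and that each has a .05 probability of reaching significance when no true difference exists; the function names are hypothetical and the binomial model is a simplification, not the analysts' stated procedure.

```python
from math import comb


def expected_by_chance(n_contrasts, alpha):
    """Expected number of significant contrasts if no true differences exist."""
    return n_contrasts * alpha


def prob_at_least(k, n, p):
    """Binomial tail probability P(X >= k) for n independent trials."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


expected = expected_by_chance(51, 0.05)          # between 2 and 3 contrasts
p_two_or_more = prob_at_least(2, 51, 0.05)       # 2 chance findings are unremarkable
p_eleven_or_more = prob_at_least(11, 51, 0.05)   # 11 chance findings are very unlikely
print(expected, p_two_or_more, p_eleven_or_more)
```

Under these assumptions, two significant contrasts are well within what chance alone would produce, whereas eleven would be a rare outcome.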




Narrative Synthesis 



In addition to the meta-analysis comparing online learning conditions with face-to-face 
instruction, analysts reviewed and summarized experimental and quasi-experimental studies 
contrasting different versions of online learning. Some of these studies contrasted purely online 
learning conditions with classes that combined online and face-to-face interactions. Others 
explored online learning with and without elements such as video, online quizzes, assigned 
groups, or guidance for online activities. Five of these studies involved K-12 learners. 

Key Findings 

The main finding from the literature review was that 

• Few rigorous research studies of the effectiveness of online learning for K-12 students 
have been published. A systematic search of the research literature from 1996 through
2006 found no experimental or controlled quasi-experimental studies comparing the 
learning effects of online versus face-to-face instruction for K-12 students that provide 
sufficient data to compute an effect size. A subsequent search that expanded the time 
frame through July 2008 identified just five published studies meeting meta-analysis 
criteria. 

The meta-analysis of 51 study effects, 44 of which were drawn from research with older learners, 
found that²

• Students who took all or part of their class online performed better, on average, than 
those taking the same course through traditional face-to-face instruction. Learning 
outcomes for students who engaged in online learning exceeded those of students 
receiving face-to-face instruction, with an average effect size of +0.24 favoring online
conditions.³ The mean difference between online and face-to-face conditions across the
51 contrasts is statistically significant at the p < .01 level.⁴ Interpretations of this result,
however, should take into consideration the fact that online and face-to-face conditions 
generally differed on multiple dimensions, including the amount of time that learners 
spent on task. The advantages observed for online learning conditions therefore may be 
the product of aspects of those treatment conditions other than the instructional delivery 
medium per se. 



² The meta-analysis was also run with just the 44 studies with older learners. Results were very similar to those for
the meta-analysis including all 51 contrasts. Variations in findings when K-12 studies are removed are described
in footnotes.

³ The + sign indicates that the outcome for the treatment condition was larger than that for the control condition. A
- sign before an effect estimate would indicate that students in the control condition had stronger outcomes than
those in the treatment condition. Cohen (1992) suggests that effect sizes of .20 can be considered “small,” those of
approximately .50 “medium,” and those of .80 or greater “large.”

⁴ The p-value represents the likelihood that an effect of this size or larger will be found by chance if the two
populations under comparison do not differ. A p-value of less than .05 indicates that there is less than 1 chance in 
20 that a difference of the observed size would be found for samples drawn from populations that do not differ. 

• Instruction combining online and face-to-face elements had a larger advantage relative
to purely face-to-face instruction than did purely online instruction. The mean effect size
in studies comparing blended with face-to-face instruction was +0.35, p < .001. This
effect size is larger than that for studies comparing purely online and purely face-to-face
conditions, which had an average effect size of +0.14, p < .05. An important issue to keep
in mind in reviewing these findings is that many studies did not attempt to equate (a) all 
the curriculum materials, (b) aspects of pedagogy and (c) learning time in the treatment 
and control conditions. Indeed, some authors asserted that it would be impossible to have 
done so. Hence, the observed advantage for online learning in general, and blended 
learning conditions in particular, is not necessarily rooted in the media used per se and 
may reflect differences in content, pedagogy and learning time. 

• Studies in which learners in the online condition spent more time on task than students in 
the face-to-face condition found a greater benefit for online learning.⁵ The mean effect
size for studies with more time spent by online learners was +0.46 compared with +0.19
for studies in which the learners in the face-to-face condition spent as much time or more
on task (Q = 3.88, p < .05).⁶

• Most of the variations in the way in which different studies implemented online learning 
did not affect student learning outcomes significantly. Analysts examined 13 online 
learning practices as potential sources of variation in the effectiveness of online learning 
compared with face-to-face instruction. Of those variables, (a) the use of a blended rather 
than a purely online approach and (b) the expansion of time on task for online learners 
were the only statistically significant influences on effectiveness. The other 11 online
learning practice variables that were analyzed did not affect student learning 
significantly. However, the relatively small number of studies contrasting learning 
outcomes for online and face-to-face instruction that included information about any 
specific aspect of implementation impeded efforts to identify online instructional 
practices that affect learning outcomes. 

• The effectiveness of online learning approaches appears quite broad across different 
content and learner types. Online learning appeared to be an effective option for both 
undergraduates (mean effect of +0.35, p < .001) and for graduate students and
professionals (+0.17, p < .05) in a wide range of academic and professional studies.
Though positive, the mean effect size is not significant for the seven contrasts involving 
K-12 students, but the number of K-12 studies is too small to warrant much confidence 
in the mean effect estimate for this learner group. Three of the K-12 studies had 
significant effects favoring a blended learning condition, one had a significant negative 
effect favoring face-to-face instruction, and three contrasts did not attain statistical 
significance. The test for learner type as a moderator variable was nonsignificant. No 



⁵ This contrast falls just short of statistical significance (p < .06) when the five K-12 contrasts are removed from the
analysis.

⁶ The Q-Between statistic tests whether the variances for the two sets of effect sizes under comparison are statistically
different.

significant differences in effectiveness were found that related to the subject of
instruction. 

• Effect sizes were larger for studies in which the online and face-to-face conditions varied
in terms of curriculum materials and aspects of instructional approach in addition to the 
medium of instruction. Analysts examined the characteristics of the studies in the meta- 
analysis to ascertain whether features of the studies’ methodologies could account for 
obtained effects. Six methodological variables were tested as potential moderators: (a) 
sample size, (b) type of knowledge tested, (c) strength of study design, (d) unit of 
assignment to condition, (e) instructor equivalence across conditions, and (f) equivalence 
of curriculum and instructional approach across conditions. Only equivalence of 
curriculum and instruction emerged as a significant moderator variable (Q = 5.40, p < 
.05). Studies in which analysts judged the curriculum and instruction to be identical or 
almost identical in online and face-to-face conditions had smaller effects than those 
studies where the two conditions varied in terms of multiple aspects of instruction (+0.20
compared with +0.42, respectively). Instruction could differ in terms of the way activities
were organized (for example as group work in one condition and independent work in 
another) or in the inclusion of instructional resources (such as a simulation or instructor 
lectures) in one condition but not the other. 
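
The moderator tests reported in the findings above (for example, Q = 3.88 for time on task and Q = 5.40 for equivalence of curriculum and instruction) compare mean effect sizes across subgroups of studies. The sketch below shows one common way such a between-groups Q statistic is computed, assuming fixed-effect inverse-variance weights. That weighting model, the function names, and the effect sizes and variances are all assumptions made for illustration; this section does not state which model the analysts used.

```python
def weighted_mean(effects, variances):
    """Inverse-variance weighted mean effect size and the sum of the weights."""
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    mean = sum(w * d for w, d in zip(weights, effects)) / total
    return mean, total


def q_between(groups):
    """Between-groups Q for a list of (effects, variances) pairs, one per subgroup."""
    stats = [weighted_mean(effects, variances) for effects, variances in groups]
    grand = sum(m * w for m, w in stats) / sum(w for _, w in stats)
    return sum(w * (m - grand) ** 2 for m, w in stats)


# Hypothetical subgroups, not data from the meta-analysis.
more_time = ([0.5, 0.4], [0.04, 0.04])
same_time = ([0.2, 0.1], [0.04, 0.04])
q = q_between([more_time, same_time])
print(round(q, 2))
```

With two subgroups, Q is compared against a chi-square distribution with one degree of freedom, whose critical value at the .05 level is about 3.84. A reported value such as Q = 3.88 therefore just clears that threshold, which is consistent with p < .05.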

The narrative review of experimental and quasi-experimental studies contrasting different online 
learning practices found that the majority of available studies suggest the following: 

• Blended and purely online learning conditions implemented within a single study 
generally result in similar student learning outcomes. When a study contrasts blended 
and purely online conditions, student learning is usually comparable across the two 
conditions. 

• Elements such as video or online quizzes do not appear to influence the amount that 
students learn in online classes. The research does not support the use of some frequently 
recommended online learning practices. Inclusion of more media in an online application 
does not appear to enhance learning. The practice of providing online quizzes does not 
seem to be more effective than other tactics such as assigning homework. 

• Online learning can be enhanced by giving learners control of their interactions with 
media and prompting learner reflection. Studies indicate that manipulations that trigger 
learner activity or learner reflection and self-monitoring of understanding are effective 
when students pursue online learning as individuals. 

• Providing guidance for learning for groups of students appears less successful than does 
using such mechanisms with individual learners. When groups of students are learning 
together online, support mechanisms such as guiding questions generally influence the 
way students interact, but not the amount they learn. 

Conclusions 

In recent experimental and quasi-experimental studies contrasting blends of online and face-to-
face instruction with conventional face-to-face classes, blended instruction has been more 
effective, providing a rationale for the effort required to design and implement blended 
approaches. Even when used by itself, online learning appears to offer a modest advantage over 
conventional classroom instruction. 

However, several caveats are in order: Despite what appears to be strong support for online 
learning applications, the studies in this meta-analysis do not demonstrate that online learning is 
superior as a medium. In many of the studies showing an advantage for online learning, the
online and classroom conditions differed in terms of time spent, curriculum and pedagogy. It was
the combination of elements in the treatment conditions (which was likely to have included 
additional learning time and materials as well as additional opportunities for collaboration) that 
produced the observed learning advantages. At the same time, one should note that online 
learning is much more conducive to the expansion of learning time than is face-to-face 
instruction. 

In addition, although the types of research designs used by the studies in the meta-analysis were 
strong (i.e., experimental or controlled quasi-experimental), many of the studies suffered from 
weaknesses such as small sample sizes; failure to report retention rates for students in the 
conditions being contrasted; and, in many cases, potential bias stemming from the authors’ dual 
roles as experimenters and instructors. 

Finally, the great majority of estimated effect sizes in the meta-analysis are for undergraduate 
and older students, not elementary or secondary learners. Although this meta-analysis did not 
find a significant effect by learner type, when learners’ age groups are considered separately, the 
mean effect size is significantly positive for undergraduate and other older learners but not for 
K-12 students. 

Another consideration is that various online learning implementation practices may have 
differing effectiveness for K-12 learners than they do for older students. It is certainly possible 
that younger students could benefit more from a different degree of teacher or computer-based 
guidance than would college students and older learners. Without new random assignment or 
controlled quasi-experimental studies of the effects of online learning options for K-12 students, 
policy-makers will lack scientific evidence of the effectiveness of these emerging alternatives to 
face-to-face instruction. 

1. Introduction



Online learning has roots in the tradition of distance education, which goes back at least 100 
years to the early correspondence courses. With the advent of the Internet and the World Wide 
Web, the potential for reaching learners around the world increased greatly, and today’s online 
learning offers rich educational resources in multiple media and the capability to support both 
real-time and asynchronous communication between instructors and learners as well as among 
different learners. Institutions of higher education and corporate training were quick to adopt 
online learning. Although K-12 school systems lagged behind at first, this sector’s adoption of e- 
learning is now proceeding rapidly. 

The National Center for Education Statistics estimated that 37 percent of school districts had 
students taking technology-supported distance education courses during school year 2004-05 
(Zandberg and Lewis 2008). Enrollments in these courses (which included two-way interactive 
video as well as Internet-based courses), were estimated at 506,950, a 60 percent increase over 
the estimate based on the previous survey for 2002-03 (Selzer and Lewis 2007). Two district 
surveys commissioned by the Sloan Consortium (Picciano and Seaman 2007; 2008) produced 
estimates that 700,000 K-12 public school students took online courses in 2005-06 and over a 
million students did so in 2007-08, a 43 percent increase.⁷ Most of these courses were at the
high school level or in combination elementary-secondary schools (Zandberg and Lewis 2008). 

These district numbers, however, do not fully capture the popularity of programs that are entirely 
online. By fall 2007, 28 states had online virtual high school programs (Tucker 2007). The 
largest of these, the Florida Virtual School, served over 60,000 students in 2007-08. In addition,
enrollment figures for courses or high school programs that are entirely online reflect just one 
part of overall K-12 online learning. Increasingly, regular classroom teachers are incorporating 
online teaching and learning activities into their instruction. 

Online learning has become popular because of its potential for providing more flexible access to 
content and instruction at any time, from any place. Frequently, the focus entails (a) increasing
the availability of learning experiences for learners who cannot or choose not to attend traditional 
face-to-face offerings, (b) assembling and disseminating instructional content more cost- 
efficiently, or (c) enabling instructors to handle more students while maintaining learning 
outcome quality that is equivalent to that of comparable face-to-face instruction. 

Different technology applications are used to support different models of online learning. One 
class of online learning models uses asynchronous communication tools (e.g., e-mail, threaded 
discussion boards, newsgroups) to allow users to contribute at their convenience. Synchronous 
technologies (e.g., webcasting, chat rooms, desktop audio/video technology) are used to 
approximate face-to-face teaching strategies such as delivering lectures and holding meetings 
with groups of students. Earlier online programs tended to implement one model or the other. 
More recent applications tend to combine multiple forms of synchronous and asynchronous 
online interactions as well as occasional face-to-face interactions. 



⁷ The Sloan Foundation surveys had very low response rates, suggesting the need for caution with respect to their
numerical estimates. 

In addition, online learning offerings are being designed to enhance the quality of learning
experiences and outcomes. One common conjecture is that learning a complex body of 
knowledge effectively requires a community of learners (Bransford, Brown and Cocking 1999; 
Riel and Polin 2004; Schwen and Kara 2004; Vrasidas and Glass 2004) and that online 
technologies can be used to expand and support such communities. Another conjecture is that 
asynchronous discourse is inherently self-reflective and therefore more conducive to deep 
learning than is synchronous discourse (Harlen and Doubler 2004; Hiltz and Goldman 2005; 
Jaffee et al. 2006). 

This literature review and meta-analysis have been guided by four research questions: 

1. How does the effectiveness of online learning compare with that of face-to-face
instruction?

2. Does supplementing face-to-face instruction with online instruction enhance learning? 

3. What practices are associated with more effective online learning? 

4. What conditions influence the effectiveness of online learning? 

Context for the Meta-analysis and Literature Review 

The meta-analysis and literature review reported here are part of the broader Evaluation of 
Evidence-Based Practices in Online Learning study that SRI International is conducting for the 
Policy and Program Studies Service of the U.S. Department of Education. The overall goal of the 
study is to provide research-based guidance to policy-makers, administrators and educators for 
implementing online learning for K-12 education. This literature search, analysis, and review 
has expanded the set of studies available for analysis by also addressing the literature concerning 
online learning in career technical education, medical and higher education, corporate and 
military training, and K-12 education. 

In addition to examining the learning effects of online learning, this meta-analysis has considered 
the conditions and practices associated with differences in effectiveness. Conditions are those 
features of the context within which the online technology is implemented that are relatively 
impervious to change. Conditions include the year in which the intervention took place, the 
learners’ demographic characteristics, the teacher’s or instructor’s qualifications, and state 
accountability systems. In contrast, practices concern how online learning is implemented (e.g.,
whether or not an online course facilitator is used). In choosing whether or where to use online
learning (e.g., to teach mathematics for high school students, to teach a second language to 
elementary students), it is important to understand the degree of effectiveness of online learning 
under differing conditions. In deciding how to implement online learning, it is important to 
understand the practices that research suggests will increase effectiveness (e.g., community 
building among participants, use of an online facilitator, blending work and training). 

Conceptual Framework for Online Learning

Modern online learning includes offerings that run the gamut from conventional didactic lectures 
or textbook-like information delivered over the Web to Internet-based collaborative role-playing 
in social simulations and highly interactive multiplayer strategy games. Examples include 
primary-grade students working on beginning reading skills over the Internet, middle school 
students collaborating with practicing scientists in the design and conduct of research, and 
teenagers who dropped out of high school taking courses online to attain the credits needed for 
graduation. The teachers of K-12 students may also participate in online education, logging in to 
online communities and reference centers and earning inservice professional development credit 
online. 

To guide the literature search and review, the research team developed a conceptual framework 
identifying three key components describing online learning: (a) whether the activity served as a 
replacement for or an enhancement to conventional face-to-face instruction, (b) the type of 
learning experience (pedagogical approach), and (c) whether communication was primarily 
synchronous or asynchronous. Each component is described in more detail below. 

One of the most basic characteristics for classifying an online activity is its objective: whether 
the activity serves as a replacement for face-to-face instruction (e.g., a virtual course) or as an 
enhancement of the face-to-face learning experience (i.e., online learning activities that are part 
of a course given face-to-face). This distinction is important because the two types of 
applications have different objectives. A replacement application that is equivalent to 
conventional instruction in terms of learning outcomes is considered a success if it provides 
learning online without sacrificing student achievement. If student outcomes are the same 
whether a course is taken online or face-to-face, then online instruction can be used 
cost-effectively in settings where too few students are situated in a particular geographic locale to 
warrant an on-site instructor (e.g., rural students, students in specialized courses). In contrast, 
online enhancement activities that produce learning outcomes that are only equivalent to (not 
better than) those resulting from face-to-face instruction alone would be considered a waste of 
time and money because the addition does not improve student outcomes. 

A second important dimension is the type of learning experience, which depends on who (or 
what) determines the way learners acquire knowledge. Learning experiences can be classified in 
terms of the amount of control that the student has over the content and nature of the learning 
activity. In traditional didactic or expository learning experiences, content is transmitted to the 
student by a lecture, written material, or other mechanisms. Such conventional instruction is 
often contrasted with active learning in which the student has control of what and how he or she 
learns. Another category of learning experiences stresses collaborative or interactive learning 
activity in which the nature of the learning content is emergent as learners interact with one 
another and with a teacher or other knowledge sources. Technologies can support any of these 
three types of learning experience: 

• Expository instruction: Digital devices transmit knowledge. 

• Active learning: The learner builds knowledge through inquiry-based manipulation of 
digital artifacts such as online drills, simulations, games, or microworlds. 

• Interactive learning: The learner builds knowledge through inquiry-based collaborative 
interaction with other learners; teachers become co-learners and act as facilitators. 

This dimension of learning-experience type is closely linked to the concept of learner control 
explored by Zhang (2005). Typically, in expository instruction, the technology delivers the 
content. In active learning, the technology allows students to control digital artifacts to explore 
information or address problems. In interactive learning, technology mediates human interaction 
either synchronously or asynchronously; learning emerges through interactions with other 
students and the technology. 

The learner-control category of interactive learning experiences is related to the so-called “fifth 
generation” of distance learning, which stresses a flexible combination of independent and group 
learning activities. Researchers are now using terms such as “distributed learning” (Dede 2006) 
or “learning communities” to refer to orchestrated mixtures of face-to-face and virtual 
interactions among a cohort of learners led by one or more instructors, facilitators or coaches 
over an extended period of time (from weeks to years). 

Finally, a third characteristic commonly used to categorize online learning activities is the extent 
to which the activity is synchronous, with instruction occurring in real time whether in a physical 
or a virtual place, or asynchronous, with a time lag between the presentation of instructional 
stimuli and student responses. Exhibit 1 illustrates the three dimensions in the framework 
guiding this meta-analysis of online learning offerings. The descriptive columns in the table 
illustrate uses of online learning comprising dimensions of each possible combination of the 
learning experience, synchronicity, and objective (an alternative or a supplement to face-to-face 
instruction). 

Exhibit 1. Conceptual Framework for Online Learning 

| Learning Experience Dimension | Synchronicity | Face-to-Face Alternative | Face-to-Face Enhancement |
| --- | --- | --- | --- |
| Expository | Synchronous | Live, one-way webcast of online lecture course with limited learner control (e.g., students proceed through materials in set sequence) | Viewing webcasts to supplement in-class learning activities |
| Expository | Asynchronous | Math course taught through online video lectures that students can access on their own schedule | Online lectures on advanced topics made available as a resource for students in a conventional math class |
| Active | Synchronous | Learning how to troubleshoot a new type of computer system by consulting experts through live chat | Chatting with experts as the culminating activity for a curriculum unit on network administration |
| Active | Asynchronous | Social studies course taught entirely through Web quests that explore issues in U.S. history | Web quest options offered as an enrichment activity for students completing their regular social studies assignments early |
| Interactive | Synchronous | Health-care course taught entirely through an online, collaborative patient management simulation that multiple students interact with at the same time | Supplementing a lecture-based course through a session spent with a collaborative online simulation used by small groups of students |
| Interactive | Asynchronous | Professional development for science teachers through “threaded” discussions and message boards on topics identified by participants | Supplemental, threaded discussions for pre-service teachers participating in a face-to-face course on science methods |

Exhibit reads: Online learning applications can be characterized in terms of (a) the kind of learning experience they provide, (b) whether 
computer-mediated instruction is primarily synchronous or asynchronous and (c) whether they are intended as an alternative or a supplement to 
face-to-face instruction. 

Many other features also apply to online learning, including the type of setting (classroom, 
home, informal), the nature of the content (both the subject area and the type of learning such as 
fact, concept, procedure or strategy), and the technology involved (e.g., audio/video streaming, 
Internet telephony, podcasting, chat, simulations, videoconferencing, shared graphical 
whiteboard, screen sharing). 
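
The three dimensions of Exhibit 1 can be sketched as a small data structure. The following is a minimal illustration only, not part of the study's coding scheme; the class and field names (Objective, Experience, Synchronicity, OnlineActivity) are assumptions introduced here for readability.

```python
from dataclasses import dataclass
from enum import Enum


class Objective(Enum):
    ALTERNATIVE = "face-to-face alternative"   # replaces face-to-face instruction
    ENHANCEMENT = "face-to-face enhancement"   # supplements a face-to-face course


class Experience(Enum):
    EXPOSITORY = "expository"     # technology delivers the content
    ACTIVE = "active"             # learner manipulates digital artifacts
    INTERACTIVE = "interactive"   # learning emerges from collaboration


class Synchronicity(Enum):
    SYNCHRONOUS = "synchronous"
    ASYNCHRONOUS = "asynchronous"


@dataclass(frozen=True)
class OnlineActivity:
    """Hypothetical record placing one activity in the Exhibit 1 framework."""
    description: str
    objective: Objective
    experience: Experience
    synchronicity: Synchronicity

    def cell(self):
        """Return the (row, sub-row, column) labels of the Exhibit 1 cell."""
        return (self.experience.value, self.synchronicity.value, self.objective.value)


# One example taken from Exhibit 1.
webquest_course = OnlineActivity(
    description="Social studies course taught entirely through Web quests",
    objective=Objective.ALTERNATIVE,
    experience=Experience.ACTIVE,
    synchronicity=Synchronicity.ASYNCHRONOUS,
)

# The framework has 3 x 2 x 2 = 12 cells, one per table entry in Exhibit 1.
ALL_CELLS = [(e, s, o) for e in Experience for s in Synchronicity for o in Objective]
```

Features outside the framework, such as setting, content type, and technology, would be additional fields on such a record rather than new dimensions of the table.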

The dimensions in the framework in Exhibit 1 were derived from prior meta-analyses in distance 
learning. Bernard et al. (2004) found advantages for asynchronous over synchronous distance 
education. In examining a different set of studies, Zhao et al. (2005) found that studies of 
distance-learning applications that combined synchronous and asynchronous communication 
tended to report more positive effects than did studies of distance learning applications with just 
one of these interaction types.⁸ Zhao et al. also found (a) advantages for blended learning (called 
“Face-to-Face Enhancement” in the Exhibit 1 framework) over purely online learning 
experiences and (b) advantages for courses with more instructor involvement compared with 
more “canned” applications that provide expository learning experiences. Thus, the three 
dimensions in Exhibit 1 capture some of the most important kinds of variation in distance 
learning and together provide a manageable framework for differentiating among the broad array 
of online activities in practice today. 

Findings From Prior Meta-Analyses 

Prior meta-analyses of distance education (including online learning studies and studies of other 
forms of distance education) and of Web-based or online learning have been conducted. Overall, 
results from Bernard et al. (2004) and other reviews of the distance education literature 
(Cavanaugh 2001; Moore 1994) indicate no significant differences in effectiveness between 
distance education and face-to-face education, suggesting that distance education, when it is the 
only option available, can successfully replace face-to-face instruction. Findings of a recent 
meta-analysis of job-related courses comparing Web-based and classroom-based learning 
(Sitzmann et al. 2006) were even more positive. They found online learning to be superior to 
classroom-based instruction in terms of declarative knowledge outcomes, with the two being 
equivalent in terms of procedural learning. 

However, a general conclusion that distance and face-to-face instruction result in essentially 
similar learning ignores differences in findings across various studies. Bernard et al. (2004) 
found tremendous variability in effect sizes (an effect size is the difference between the mean for 
the treatment group and the mean for the control group, divided by the pooled standard 
deviation), which ranged from -1.31 to +1.41.⁹ From their meta-analysis, which included coding 
for a wide range of instructional and other characteristics, the researchers concluded that selected 
conditions and practices were associated with differences in outcomes. For example, they found 
that distance education that used synchronous instruction was significantly negative in its effect, 
with an average effect size of -0.10, whereas the average effect size for studies using 
asynchronous instruction was significantly positive (+0.05). However, the studies that Bernard et 
al. categorized as using synchronous communication involved “yoked” classrooms; that is, the 
instructor’s classroom was the center of the activity, and one or more distant classrooms 
interacted with it in “hub and spoke” fashion. These satellite classes are markedly different from 
today’s Web-based communication among the multiple nodes in a “learning network.” 

⁸ Both of these meta-analyses included video-based distance learning as well as Web-based learning and also 
included studies in which the outcome measure was student satisfaction, attitude or other nonlearning measures. 
The meta-analysis reported here is restricted to an analysis of effect sizes for objective student learning measures 
in experimental, controlled quasi-experimental, and crossover studies of applications with Web-based 
components. 

⁹ Cohen (1992) suggests that effect sizes of .20 can be considered “small,” those of approximately .50 “medium,” 
and those of .80 or greater “large.” 

Machtmes and Asher’s earlier (2000) meta-analysis of telecourses sheds light on this issue.¹⁰ 
Although detecting no difference between distance and face-to-face learning overall, they found 
results more favorable for telecourses when classrooms had two-way, as opposed to one-way, 
interactions. 

Although earlier meta-analyses of distance education found it equivalent to classroom instruction 
(as noted above), several reviewers have suggested that this pattern may change. They argue that 
online learning as practiced in the 21st century can be expected to outperform earlier forms of 
distance education in terms of effects on learning (Zhao et al. 2005). 

The meta-analysis reported here differs from earlier meta-analyses because its focus has been 
restricted to studies that did the following: 

• Investigated significant use of the Web for instruction 

• Had an objective learning measure as the outcome measure 

• Met higher quality criteria in terms of study design (i.e., an experimental or controlled 
quasi-experimental design) 

Structure of the Report 

Chapter 2 describes the methods used in searching for appropriate research articles, in screening 
those articles for relevance and study quality, in coding study features, and in calculating effect 
sizes. Chapter 3 describes the 51 study effects identified through the article search and screening 
and presents findings in the form of effect sizes for studies contrasting purely online or blended 
learning conditions with face-to-face instruction. Chapter 4 provides a qualitative narrative 
synthesis of research studies comparing variations of online learning interventions. Finally, 
chapter 5 discusses the implications of the literature search and meta-analysis for future studies 
of online learning and for K-12 online learning practices. 



¹⁰ Like the present meta-analysis, Machtmes and Asher limited their study corpus to experiments or 
quasi-experiments with an achievement measure as the learning outcome. 

2. Methodology 



This chapter describes the procedures used to search for, screen and code controlled studies of 
the effectiveness of online learning. The products of these search, screening and coding activities 
were used for the meta-analysis and narrative literature review, which are described in chapters 3 
and 4, respectively. 

Definition of Online Learning 

For this review, online learning is defined as learning that takes place partially or entirely over 
the Internet. This definition excludes purely print-based correspondence education, broadcast 
television or radio, videoconferencing, videocassettes, and stand-alone educational software 
programs that do not have a significant Internet-based instructional component. 

In contrast to previous meta-analyses, this review distinguishes between two purposes for online 
learning: 

• Learning conducted totally online as a substitute or alternative to face-to-face learning 

• Online learning components that are combined or blended (sometimes called “hybrid”) 
with face-to-face instruction to provide learning enhancement 

As indicated in chapter 1, this distinction was made because of the different implications of 
finding a null effect (i.e., no difference in effects between the treatment and the control group) 
under the two circumstances. Equivalence between online learning and face-to-face learning 
justifies using online alternatives, but online enhancements need to be justified by superior 
learning outcomes. These two purposes of online learning defined the first two categories of 
study in the literature search: 

• Studies comparing an online learning condition with a face-to-face control condition 
(Category 1) 

• Studies comparing a blended condition with a face-to-face control condition without the 
online learning components (Category 2). 

In addition, researchers sought experimental and controlled quasi-experimental studies that 
compared the effectiveness of different online learning practices. This third study category 
consisted of the following: 

• Studies testing the learning effects of variations in online learning practices such as 
online learning with and without interactive video (Category 3). 

Data Sources and Search Strategies 

Relevant studies were located through a comprehensive search of publicly available literature 
published from 1996 through July 2008.¹¹ Searches of dissertations were limited to those 
published from 2005 through July 2008 to allow researchers to use UMI ProQuest Digital 
Dissertations for retrieval. 

Electronic Database Searches 

Using a common set of keywords, searches were performed in five electronic research databases: 
ERIC, PsycINFO, PubMed, ABI/INFORM, and UMI ProQuest Digital Dissertations. The 
appendix lists the terms used for the initial electronic database search and for additional searches 
for studies of online learning in the areas of career technical education and teacher education. 

Additional Search Activities 

The electronic database searches were supplemented with a review of articles cited in recent 
meta-analyses and narrative syntheses of research on distance learning (Bernard et al. 2004; 
Cavanaugh et al. 2004; Childs 2001; Sitzmann et al. 2006; Tallent-Runnels et al. 2006; WestEd 
with Edvance Research 2008; Wisher and Olson 2003; Zhao et al. 2005), including those for 
teacher professional development and career technical education (Whitehouse et al. 2006; Zirkle 
2003). The analysts examined references from these reviews to identify studies that might meet 
the criteria for inclusion in the present review. 

Abstracts were manually reviewed for articles published since 2005 in the following key 
journals: American Journal of Distance Education, Journal of Distance Education (Canada), 
Distance Education (Australia), International Review of Research in Distance and Open 
Education, and Journal of Asynchronous Learning Networks. In addition, the Journal of 
Technology and Teacher Education and Career and Technical Education Research were 
searched manually. Finally, the Google Scholar search engine was used with a series of 
keywords related to online learning (available from the authors). Article abstracts retrieved 
through these additional search activities were reviewed to remove duplicates of articles 
identified earlier. 

¹¹ Literature searches were performed in two waves: in March 2007 for studies published from 1996-2006 and in 
July 2008 for studies published from 2007 to July 2008. 

Screening Process 

Screening of the research studies obtained through the search process described above was 
carried out in two stages. The intent of the two-stage approach was to gain efficiency without 
risking exclusion of potentially relevant, high-quality studies of online learning effects. 

Initial Screen for Abstracts From Electronic Databases 

The initial electronic database searches (excluding the additional searches conducted for teacher 
professional development and career technical education) yielded 1,132 articles.¹² Citation 
information and abstracts of these studies were examined to ascertain whether they met the 
following three initial inclusion criteria: 

1. Does the study address online learning as this review defines it? 

2. Does the study appear to use a controlled design (experimental/quasi-experimental 
design)? 

3. Does the study report data on student achievement or another learning outcome? 

At this early stage, analysts gave studies “the benefit of the doubt,” retaining those that were not 
clearly outside the inclusion criteria on the basis of their citations and abstracts. As a result of 
this screening, 316 articles were retained and 816 articles were excluded. During this initial 
screen, 45 percent of the articles were excluded primarily because they did not have a controlled 
design. Twenty-six percent of articles were eliminated because they did not report learning 
outcomes for treatment and control groups. Twenty-three percent were eliminated because their 
intervention did not qualify as online learning, given the definition used for this meta-analysis 
and review. The remaining six percent of the articles were excluded for other reasons such as 
articles written in a language other than English. 

Full-text Screen 

From the other data sources (i.e., references in earlier reviews, manual review of key journals, 
recommendation from a study advisor, and Google Scholar searches), researchers identified and 
retrieved an additional 186 articles, yielding a total of 502 articles that they subjected to a 
full-text screening for possible inclusion in the analysis. Nine analysts who were trained on a set of 
full-text screening criteria reviewed the 502 articles for both topical relevance and study quality. 

A study had to meet content relevance criteria to be included in the meta-analysis. Thus, 
qualifying studies had to 

1. Involve learning that took place over the Internet. The use of the Internet had to be a 
major part of the intervention. Studies in which the Internet was only an incidental 
component of the intervention were excluded. In operational terms, to qualify as online 
learning, a study treatment needed to provide at least a quarter of the 
instruction/learning of the content assessed by the study’s learning measure by means 
of the Internet. 

¹² This number includes multiple instances of the same study identified in different databases. 



2. Contrast conditions that varied in terms of use of online learning. Learning outcomes 
had to be compared against conditions falling into at least one of two study categories: 
Category 1, online learning compared with offline/face-to-face learning, and Category 
2, a combination of online plus offline/face-to-face learning (i.e., blended learning) 
compared with offline/face-to-face learning alone. 

3. Describe an intervention study that had been completed. Descriptions of study designs, 
evaluation plans or theoretical frameworks were excluded. The length of the 
intervention/treatment could vary from a few hours to a quarter, semester, year or 
longer. 

4. Report a learning outcome that was measured for both treatment and control groups. 

A learning outcome needed to be measured in the same way across study conditions. A 
study was excluded if it explicitly indicated that different examinations were used for 
the treatment and control groups. The measure had to be objective and direct; learner or 
teacher/instructor self-report of learning was not considered a direct measure. 

Examples of learning outcome measures included scores on standardized tests, scores 
on researcher-created assessments, grades/scores on teacher-created assessments (e.g., 
assignments, midterm/final exams), and grades or grade point averages. Examples of 
learning outcome measures for teacher learners (in addition to those accepted as 
student outcomes) included assessments of content knowledge, analysis of lesson plans 
or other materials related to the intervention, observation (or logs) of class activities, 
analysis of portfolios, or supervisor’s rating of job performance. Studies that used only 
nonlearning outcome measures (e.g., attitude, retention, attendance, level of 
learner/instructor satisfaction) were excluded. 

Studies also had to meet basic quality (method) criteria to be included. Thus, qualifying studies 
had to 

5. Use a controlled design (experimental or quasi-experimental). Design studies, 
exploratory studies or case studies that did not use a controlled research design were 
excluded. For quasi-experimental designs, the analysis of the effects of the intervention 
had to include statistical controls for possible differences between the treatment and 
control groups in terms of prior achievement. 

6. Report sufficient data for effect size calculation or estimation as specified in the 
guidelines provided by the What Works Clearinghouse (2007) and by Lipsey and 
Wilson (2000). 

Studies that contrasted different versions of online learning (Category 3) needed to meet Criteria 
1 and 3-5 to be included in the narrative research summary. 
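
The inclusion rules above can be summarized as a simple decision function. This is a rough sketch under stated assumptions rather than the analysts' actual screening instrument: the StudyRecord fields and the eligibility function are hypothetical names that paraphrase Criteria 1-6, and the judgment calls that analysts made on borderline cases are reduced here to boolean flags.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StudyRecord:
    # All field names are hypothetical paraphrases of the criteria in the text.
    internet_share: float              # fraction of assessed content taught via the Internet
    category: int                      # 1, 2 or 3; any other value means no category fits
    completed: bool                    # Criterion 3: a completed intervention study
    objective_outcome_both_groups: bool  # Criterion 4: same direct measure in both groups
    controlled_design: bool            # Criterion 5: experimental or quasi-experimental
    quasi_experiment: bool
    controls_prior_achievement: bool   # required when the design is quasi-experimental
    effect_size_data: bool             # Criterion 6: enough data for an effect size


def eligibility(s: StudyRecord) -> Optional[str]:
    """Return 'meta-analysis', 'narrative synthesis', or None if excluded."""
    c1 = s.internet_share >= 0.25      # "at least a quarter" of the instruction
    c3 = s.completed
    c4 = s.objective_outcome_both_groups
    c5 = s.controlled_design and (
        not s.quasi_experiment or s.controls_prior_achievement
    )
    if not (c1 and c3 and c4 and c5):
        return None
    if s.category == 3:
        # Category 3 needs only Criteria 1 and 3-5.
        return "narrative synthesis"
    if s.category in (1, 2) and s.effect_size_data:
        # Criteria 2 and 6 apply to the quantitative analysis.
        return "meta-analysis"
    return None
```

For example, a randomized Category 1 study with a common posttest and reported means and standard deviations would return 'meta-analysis', whereas a quasi-experiment with no control for prior achievement would return None.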

An analyst read each full text, and all borderline cases were discussed and resolved either at 
project meetings or through consultation with task leaders. To prevent studies from being 
mistakenly screened out, two analysts coded studies on features that were deemed to require 
significant degrees of inference. These features consisted of the following: 

• Failure to have students use the Internet for a significant portion of the time that they 
spent learning the content assessed by the study’s learning measure 

• Lack of statistical control for prior abilities in quasi-experiments 

From the 502 articles, analysts identified 522 independent studies (some articles reported more 
than one study). When the same study was reported in different publication formats (e.g., 
conference paper and journal article), only the more formal journal article was retained for the 
analysis. 

Of the 522 studies, 176 met all the criteria of the full-text screening process. Exhibit 2 shows the 
bases for exclusion for the 346 studies that did not meet all the criteria. 



Exhibit 2. Bases for Excluding Studies During the Full-Text Screening Process 

| Primary Reason for Exclusion | Number Excluded | Percentage Excluded |
| --- | --- | --- |
| Did not use statistical control | 137 | 39 |
| Was not online as defined in this review | 90 | 26 |
| Did not analyze learning outcomes | 52 | 15 |
| Did not have a comparison group that received a comparable treatment | 22 | 7 |
| Did not fit into any of the three study categories | 39 | 11 |
| Excluded for other reasonsᵃ | 6 | 2 |

Exhibit reads: The most common reason for a study’s exclusion from the analysis was failure to use statistical 
control (in a quasi-experiment). 

ᵃ Other reasons for exclusion included (a) did not provide enough information, (b) was written in a language other than 
English, and (c) used different learning outcome measures for the treatment and control groups. 



Effect Size Extraction 

Of the 176 independent studies, 99 had at least one contrast between online learning and face-to- 
face/offline learning (Category 1) or between blended learning and face-to-face/offline learning 
(Category 2). These studies were subjected to quantitative analysis to extract effect sizes. 

Of the 99 studies, only nine were conducted with K-12 students (Chang 2008; Englert et al. 
2007; Long and Jennings 2005; O’Dwyer, Carey and Kleiman 2007; Parker 1999; Rockman et 
al. 2007; Stevens 1999; Sun, Lin and Yu 2008; Uzunboylu 2004). Of them, four were excluded 
from the meta-analysis: Chang (2008), Parker (1999), and Uzunboylu (2004) did not provide 
sufficient statistical data to compute effect sizes, and the Stevens (1999) study was a 
quasi-experiment without a statistical control for potential existing differences in achievement. 

An effect size is similar to a z-score in that it is expressed in terms of units of standard deviation. 
It is defined as the difference between the treatment and control means, divided by the pooled 
standard deviation. Effect sizes can be calculated (a) from the means and standard deviations for 
the two groups or (b) on the basis of information provided in statistical tests such as t-tests and 
analyses of variance. Following the guidelines from the What Works Clearinghouse (2007) and 
Lipsey and Wilson (2000), numerical and statistical data contained in the studies were extracted 
so that Comprehensive Meta-Analysis software (Biostat Solutions 2006) could be used to 
calculate effect sizes (g). The precision of each effect estimate was determined by using the 
estimated standard error of the mean to calculate the 95-percent confidence interval for each 
effect. 
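
The definition above can be illustrated with a short calculation. The sketch below is not the Comprehensive Meta-Analysis implementation; it uses the standard textbook formulas for the standardized mean difference, the conversion from an independent-groups t statistic, the small-sample correction that yields Hedges' g, and a normal-approximation 95-percent confidence interval. The function names are assumptions made for this example.

```python
import math


def cohens_d(mean_t, mean_c, sd_t, sd_c, n_t, n_c):
    """Treatment minus control mean, divided by the pooled standard deviation."""
    pooled_var = ((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2) / (n_t + n_c - 2)
    return (mean_t - mean_c) / math.sqrt(pooled_var)


def d_from_t(t, n_t, n_c):
    """Recover d from an independent-groups t statistic."""
    return t * math.sqrt(1.0 / n_t + 1.0 / n_c)


def hedges_g(d, n_t, n_c):
    """Apply the small-sample correction; return (g, standard error, 95% CI)."""
    df = n_t + n_c - 2
    j = 1.0 - 3.0 / (4.0 * df - 1.0)               # approximate correction factor
    var_d = (n_t + n_c) / (n_t * n_c) + d ** 2 / (2.0 * (n_t + n_c))
    g = j * d
    se = math.sqrt(j ** 2 * var_d)
    return g, se, (g - 1.96 * se, g + 1.96 * se)


# Illustrative (made-up) summary statistics, not data from any reviewed study.
d = cohens_d(mean_t=10.0, mean_c=8.0, sd_t=2.0, sd_c=2.0, n_t=20, n_c=20)
g, se, ci = hedges_g(d, n_t=20, n_c=20)
```

A positive g indicates that the treatment (online or blended) group outperformed the face-to-face control group; an interval that excludes zero corresponds to a statistically significant contrast at the .05 level.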

The review of the 99 studies to obtain the data for calculating effect size produced 51 
independent effect sizes (28 for Category 1 and 23 for Category 2) from 46 studies. Fifty-three 
studies did not report sufficient data to support calculating effect size. 
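
The counts reported for the search and screening process can be cross-checked with simple arithmetic. The sketch below uses only figures stated in this chapter; the dictionary keys are descriptive labels introduced here.

```python
# Counts reported in the text of this chapter.
funnel = {
    "database_hits": 1132,
    "retained_after_abstract_screen": 316,
    "excluded_at_abstract_screen": 816,
    "added_from_other_sources": 186,
    "articles_in_full_text_screen": 502,
    "independent_studies": 522,
    "met_full_text_criteria": 176,
    "excluded_at_full_text_screen": 346,
    "studies_with_category_1_or_2_contrast": 99,
    "studies_yielding_effect_sizes": 46,
    "studies_without_sufficient_data": 53,
    "category_1_effects": 28,
    "category_2_effects": 23,
}

checks = {
    "abstract screen": funnel["retained_after_abstract_screen"]
    + funnel["excluded_at_abstract_screen"] == funnel["database_hits"],
    "full-text pool": funnel["retained_after_abstract_screen"]
    + funnel["added_from_other_sources"] == funnel["articles_in_full_text_screen"],
    "full-text screen": funnel["met_full_text_criteria"]
    + funnel["excluded_at_full_text_screen"] == funnel["independent_studies"],
    "effect size extraction": funnel["studies_yielding_effect_sizes"]
    + funnel["studies_without_sufficient_data"]
    == funnel["studies_with_category_1_or_2_contrast"],
    "independent effects": funnel["category_1_effects"]
    + funnel["category_2_effects"] == 51,
}

for name, ok in checks.items():
    print(f"{name}: {'consistent' if ok else 'INCONSISTENT'}")
```

Note that the 522 independent studies exceed the 502 articles because some articles reported more than one study, so that step is not a simple subtraction.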

Coding of Study Features 

All studies that provided enough data to compute an effect size were coded for their study 
features and for study quality. Building on the project’s conceptual framework and the coding 
schemes used in several earlier meta-analyses (Bernard et al. 2004; Sitzmann et al. 2006), a 
coding structure was developed and pilot-tested with several studies. The top-level coding 
structure, incorporating refinements made after pilot testing, is shown in Exhibit A-4 of the 
appendix. 

To determine interrater reliability, two researchers coded 20 percent of the studies, achieving an 
interrater reliability of 86 percent across those studies. Analysis of coder disagreements resulted 
in the refinement of some definitions and decision rules for some codes; other codes that 
required information that articles did not provide or that proved difficult to code reliably were 
eliminated (e.g., whether or not the instructor was certified). A single researcher coded the 
remaining studies. 
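
The report does not state which reliability index was used. Assuming that the 86 percent figure is simple percent agreement between the two coders, the calculation could be sketched as follows; the function name and the example codes are invented for illustration.

```python
def percent_agreement(codes_a, codes_b):
    """Share of items (in percent) on which two coders assigned the same code."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("Both coders must rate the same, non-empty set of items.")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return 100.0 * matches / len(codes_a)


# Made-up codes for five study features; not data from the review.
coder_1 = ["blended", "asynchronous", "expository", "for credit", "under 1 month"]
coder_2 = ["blended", "asynchronous", "active", "for credit", "under 1 month"]
agreement = percent_agreement(coder_1, coder_2)
```

Percent agreement does not correct for chance agreement; a chance-corrected index such as Cohen's kappa would be a stricter alternative.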

Data Analysis 

Before combining effects from multiple contrasts, effect sizes were weighted to avoid undue 
influence of studies with small sample sizes (Hedges and Olkin 1985). For the total set of 51 
contrasts and for each subset of contrasts being investigated, a weighted mean effect size 
(Hedges’ g+) was computed by weighting the effect size for each study contrast by the inverse of 
its variance. The precision of each mean effect estimate was determined by using the estimated 
standard error of the mean to calculate the 95 percent confidence interval. Using a fixed-effects 
model, the heterogeneity of the effect size distribution (the Q-statistic) was computed to indicate 
the extent to which variation in effect sizes was not explained by sampling error alone. 
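
The weighting scheme described above can be sketched in a few lines. This is a minimal fixed-effects illustration using the standard inverse-variance formulas, not the output of the Comprehensive Meta-Analysis software; the function name and the input values are assumptions for the example.

```python
import math


def fixed_effect_summary(effects, variances):
    """Inverse-variance weighted mean effect (g+), its 95% CI, and the Q statistic."""
    weights = [1.0 / v for v in variances]          # weight = inverse of variance
    total_w = sum(weights)
    g_plus = sum(w * g for w, g in zip(weights, effects)) / total_w
    se = math.sqrt(1.0 / total_w)                   # standard error of the mean effect
    # Q: weighted squared deviations of each effect from the mean effect.
    q = sum(w * (g - g_plus) ** 2 for w, g in zip(weights, effects))
    return {
        "g_plus": g_plus,
        "se": se,
        "ci95": (g_plus - 1.96 * se, g_plus + 1.96 * se),
        "Q": q,
        "df": len(effects) - 1,
    }


# Two made-up contrasts with equal precision; not data from the review.
summary = fixed_effect_summary(effects=[0.2, 0.4], variances=[0.01, 0.01])
```

Under the hypothesis that all contrasts share one true effect, Q follows approximately a chi-square distribution with df degrees of freedom, so a Q well above df signals variation beyond sampling error.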

Next, a series of post-hoc subgroup and moderator variable analyses were conducted using the 
Comprehensive Meta-Analysis software. A mixed-effects model was used for these analyses to 
model within-group variation.¹³ A between-group heterogeneity statistic (Q_Between) was computed 
to test for statistical differences in the weighted mean effect sizes for various subsets of the 
effects (e.g., studies using blended as opposed to purely online learning for the treatment group). 
Chapter 3 describes the results of these analyses. 
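
One common way to carry out such a subgroup comparison is sketched below. It is an assumption-laden illustration, not the procedure implemented in the software the analysts used: each subgroup is pooled with random-effects weights based on a separate DerSimonian-Laird estimate of between-study variance, and the subgroup means are then compared with a fixed-effects Q statistic. Whether the original analysis estimated that variance separately or pooled it across subgroups is not stated in the text.

```python
def dl_tau2(effects, variances):
    """DerSimonian-Laird estimate of between-study variance within one subgroup."""
    if len(effects) < 2:
        return 0.0                                   # not estimable from one study
    w = [1.0 / v for v in variances]
    sw = sum(w)
    mean = sum(wi * g for wi, g in zip(w, effects)) / sw
    q = sum(wi * (g - mean) ** 2 for wi, g in zip(w, effects))
    c = sw - sum(wi ** 2 for wi in w) / sw
    return max(0.0, (q - (len(effects) - 1)) / c)


def subgroup_summary(effects, variances):
    """Random-effects mean and variance of that mean for one subgroup."""
    tau2 = dl_tau2(effects, variances)
    w = [1.0 / (v + tau2) for v in variances]
    sw = sum(w)
    mean = sum(wi * g for wi, g in zip(w, effects)) / sw
    return mean, 1.0 / sw


def q_between(groups):
    """Return (Q_between, df) for a dict mapping label -> (effects, variances)."""
    summaries = [subgroup_summary(e, v) for e, v in groups.values()]
    weights = [1.0 / var for _, var in summaries]
    grand = sum(w * m for w, (m, _) in zip(weights, summaries)) / sum(weights)
    q = sum(w * (m - grand) ** 2 for w, (m, _) in zip(weights, summaries))
    return q, len(summaries) - 1


# Made-up values for two hypothetical subgroups; not data from the review.
q_b, df_b = q_between({
    "blended": ([0.4], [0.02]),
    "purely online": ([0.0], [0.02]),
})
```

Q_between is referred to a chi-square distribution with df equal to the number of subgroups minus one; a large value indicates that the subgroup mean effects differ by more than chance would predict.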



¹³ Meta-analysts need to choose between a mixed-effects and a fixed-effects model for investigating moderator 
variables. A fixed-effects analysis is more sensitive to differences related to moderator variables, but has a greater 
likelihood of producing Type I errors (falsely rejecting the null hypothesis). The mixed-effects model reduces the 
likelihood of Type I errors by adding a random constant to the standard errors, but does so at the cost of 
increasing the likelihood of Type II errors (incorrectly accepting the null hypothesis). Analysts chose the more 
conservative mixed-effects model for this investigation of moderator variables. 

3. Findings 



This chapter presents the results of the meta-analysis of controlled studies that compared the 
effectiveness of online learning with that of face-to-face instruction. The next chapter presents a 
narrative synthesis of studies that compared different versions of online learning with each other 
rather than with a face-to-face control condition. 

Nature of the Studies in the Meta-Analysis 

As indicated in chapter 2, 51 independent effect sizes could be abstracted from the study corpus 
of 46 studies.¹⁴ The number of students in the studies included in the meta-analysis ranged from 
16 to 1,857, but most of the studies were modest in scope. Although large-scale applications of 
online learning have emerged, only five studies in the meta-analysis corpus included more than 
400 learners. The types of learners in these studies were about evenly split between students in 
college or earlier years of education and learners in graduate programs or professional training. 
The average learner age ranged from 13 to 44. Nearly all the studies involved formal instruction, 
with the most common subject matter being medicine or health care. Other content types 
included computer science, teacher education, social science, mathematics, languages, science 
and business. Roughly half of the learners were taking the instruction for credit or as an 
aeademic requirement. Of the 49 contrasts for which the study indicated the length of instruction, 
19 involved instructional time frames of less than a month and the remainder involved longer 
periods. 

In terms of instructional features, the online learning conditions in these studies were less likely 
to be instructor-directed (8 contrasts) than they were to be student-directed, independent learning 
(17 contrasts) or interactive and collaborative in nature (23 contrasts). Online learners typically 
had opportunities to practice skills or test their knowledge (42 effects were from studies 
reporting such opportunities). Opportunities for learners to receive feedback were less common; 
they were reported in the studies associated with 24 effects. The opportunity for online 
learners to have face-to-face contact with the instructor during the time frame of the course was 
present in the case of 21 out of 51 effects. The details of instructional media and communication 
options available to online learners were absent in many of the study narratives. Among the 51 
contrasts, analysts could document the presence of one-way video or audio in the online 
condition for 15 effects. Similarly, 16 contrasts involved online conditions with asynchronous 
communication only; 9 involved both asynchronous and synchronous online communication; and 
26 contrasts came from studies that did not document the types of online communication 
provided to learners. 



After the first literature search, which yielded 29 independent effects, the research team ran additional analyses to 
find out how many more studies could be included if the study design criterion were relaxed to include quasi- 
experiments with pre- and posttests with no statistical adjustments made for preexisting differences. The relaxed 
standard would have increased the corpus for analysis by just 10 studies, nearly all of which were in Category 1 
and which had more positive effect sizes than the Category 1 studies with stronger analytic designs. Analysts 
decided not to include those studies in the meta-analysis. Instead, the study corpus was enlarged by conducting a 
second literature search in July 2008. 
Among the 51 individual contrasts between online and face-to-face instruction, 11 were 
significantly positive, favoring the online or blended learning condition. Two significant 
negative effects favored traditional face-to-face instruction. The fact that multiple comparisons 
were conducted should be kept in mind when interpreting this pattern of findings. Because 
analysts used an α < .05 level of significance for testing differences, one would expect 
approximately 1 in 20 contrasts to show a significant difference by chance alone. For 51 
contrasts, then, one would expect 2 or 3 significant differences by chance. The finding of 2 
significant contrasts favoring face-to-face instruction is clearly within the range one would 
expect by chance; the 11 contrasts favoring online or hybrid instruction exceed what one would 
expect by chance. 
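
The chance expectation described above can be checked with a short calculation. The sketch below is illustrative only: it assumes, as a simplification that the report does not state, that the 51 contrasts are independent and that each has a 0.05 probability of reaching significance under the null hypothesis. The function name `binomial_tail` is hypothetical.

```python
from math import comb


def binomial_tail(n: int, k: int, p: float) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


n_contrasts = 51   # independent effects in the meta-analysis
alpha = 0.05       # significance level used by the analysts

# Expected number of significant contrasts if every null hypothesis were true.
expected_by_chance = n_contrasts * alpha

# Assumption: contrasts behave as independent trials with success probability alpha.
p_two_or_more = binomial_tail(n_contrasts, 2, alpha)
p_eleven_or_more = binomial_tail(n_contrasts, 11, alpha)

print(f"Expected by chance: {expected_by_chance:.2f}")
print(f"P(at least 2 significant by chance):  {p_two_or_more:.3f}")
print(f"P(at least 11 significant by chance): {p_eleven_or_more:.6f}")
```

Under these assumptions, two significant results are unremarkable, whereas eleven would be very unlikely to arise from chance alone, which is the argument the paragraph makes informally.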

Exhibit 3 illustrates the 51 effect sizes derived from the 46 articles. Exhibits 4a and 4b present 
the effect sizes for Category 1 (purely online versus face-to-face) and Category 2 (blended versus 
face-to-face) studies, respectively, along with standard errors, statistical significance, and the 
95-percent confidence interval. 

Main Effects 

The overall finding of the meta-analysis is that classes with online learning (whether taught 
completely online or blended) on average produce stronger student learning outcomes than do 
classes with solely face-to-face instruction. The mean effect size for all 51 contrasts was +0.24, 
p < .001. 

The conceptual framework for this study, which distinguishes between purely online and blended 
forms of instruction, calls for creating subsets of the effect estimates to address two more 
nuanced research questions: 

• How does the effectiveness of online learning compare with that of face-to-face 
instruction? Looking only at the 28 Category 1 effects that compared a purely online 
condition with face-to-face instruction, analysts found a mean effect of +0.14, p < .05. 
This finding is more positive than those of previous summaries of distance learning 
(generally from pre-Internet studies), most of which concluded that learning at a distance 
is as effective as classroom instruction but no better. 



Some references appear twice in Exhibit 3 because multiple effect sizes were extracted from the same article. 
Davis et al. (1999) and Caldwell (2006) each included two contrasts — online versus face-to-face (Category 1) and 
blended versus face-to-face (Category 2). Rockman et al. (2007) and Schilling et al. (2006) report findings for two 
distinct learning measures. Long and Jennings (2005) report findings from two distinct experiments, a “wave 1” in 
which teachers were implementing online learning for the first time and a “wave 2” in which teachers 
implemented online learning a second time with new groups of students. 
• Does supplementing face-to-face instruction with online instruction enhance learning? 
For the 23 Category 2 contrasts that compared blended conditions of online plus 
face-to-face learning with face-to-face instruction alone, the mean effect size of +0.35 was 
significant (p < .0001). Blends of online and face-to-face instruction, on average, had 
stronger learning outcomes than did face-to-face instruction alone. 

A test of the difference between Category 1 and Category 2 studies found that the mean effect 
size was larger for contrasts pitting blended learning against face-to-face instruction (g+ = +0.35) 
than for those of purely online versus face-to-face instruction (g+ = +0.14); the difference 
between the two subsets of studies was statistically significant (Q = 4.98, p < .05). 
Exhibit 3. Effect Sizes for Contrasts in the Meta-Analysis 

[Forest plot: Hedges' g and 95% confidence interval for each contrast, plotted on an axis from -2 to +2 
and grouped into Category 1 and Category 2 studies. The plotted values are listed in Exhibits 4a and 4b.] 

Exhibit reads: The effect size estimate for Schoenfeld-Tacher, McConnell and Graham (2001) was +0.80 
with a 95 percent probability that the true effect size lies between -0.10 and +1.70. 
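
The interval quoted in the exhibit note can be approximately reproduced from the effect size and its standard error. The sketch below assumes a normal-approximation interval (g ± 1.96 × SE); the report does not spell out its exact computation, so this is an illustration rather than the analysts' procedure. The standard error of 0.459 is the value listed for this study in Exhibit 4a, and the function names are hypothetical.

```python
def confidence_interval(g: float, se: float, z_crit: float = 1.96) -> tuple[float, float]:
    """Normal-approximation confidence interval for an effect size."""
    half_width = z_crit * se
    return g - half_width, g + half_width


def z_value(g: float, se: float) -> float:
    """Test statistic for the null hypothesis that the true effect is zero."""
    return g / se


# Schoenfeld-Tacher, McConnell and Graham (2001): g and SE as listed in Exhibit 4a.
g, se = 0.800, 0.459
lower, upper = confidence_interval(g, se)
z = z_value(g, se)

print(f"95% CI: {lower:+.2f} to {upper:+.2f}; z = {z:.2f}")
```

Because the interval includes zero and the z-value falls below 1.96, this contrast is not statistically significant at the .05 level despite its large point estimate.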
Exhibit 4a. Purely Online Versus Face-to-Face (Category 1) Studies Included in the Meta-Analysis 

| Authors | Title | Effect Size (g) | SE | 95% CI Lower Limit | 95% CI Upper Limit | Test of Null Hypothesis (2-tail) Z-Value | Retention Rate, Online (%) | Retention Rate, Face-to-Face (%) | Number of Units Assigned [a] |
|---|---|---|---|---|---|---|---|---|---|
| Beeckman et al. (2008) | Pressure ulcers: E-learning to improve classification by nurses and nursing students | +0.187 | 0.137 | -0.082 | 0.455 | 1.36 | Unknown | Unknown | 426 participants |
| Bello et al. (2005) | Online vs. live methods for teaching difficult airway management to anesthesiology residents | +0.210 | 0.264 | -0.308 | 0.728 | 0.79 | 100 | 100 | participants |
| Benjamin et al. (2007) | A randomized controlled trial comparing Web to in-person training for child care health consultants | +0.046 | 0.340 | -0.620 | 0.713 | 0.14 | Unknown | Unknown | 23 participants |
| Beyea et al. (2008) | Evaluation of a particle repositioning maneuver Web-based teaching module | +0.790 | 0.493 | -0.176 | 1.756 | 1.60 | Unknown | Unknown | 17-20 participants [b] |
| Caldwell (2006) | A comparative study of traditional, Web-based and online instructional modalities in a computer programming course | +0.132 | 0.310 | -0.476 | 0.740 | 0.43 | 100 | 100 | 60 students |
| Cavus, Uzunboylu and Ibrahim (2007) | Assessing the success rate of students using a learning management system together with a collaborative tool in Web-based teaching of programming languages | +0.466 | 0.335 | -0.190 | 1.122 | 1.39 | Unknown | Unknown | 54 students |
| Davis et al. (1999) | Developing online courses: A comparison of Web-based instruction with traditional instruction | +0.379 | 0.339 | -0.285 | 1.042 | 1.12 | Unknown | Unknown | 2 courses/classrooms |
| Hairston (2007) | Employees' attitudes toward e-learning: Implications for policy in industry environments | +0.028 | 0.155 | -0.275 | 0.331 | 0.18 | 70 | 58.33 | 168 participants |
| Harris et al. (2008) | Educating generalist physicians about chronic pain with live experts and online education | +0.285 | 0.252 | -0.209 | 0.779 | 1.13 | 84.21 | 94.44 | 62 participants |
| Hugenholtz et al. (2008) | Effectiveness of e-learning in continuing medical education for occupational physicians | +0.111 | 0.233 | -0.346 | 0.569 | 0.48 | Unknown | Unknown | 72 participants |
| Jang et al. (2005) | Effects of a Web-based teaching method on undergraduate nursing students' learning of electrocardiography | +0.530 | 0.197 | 0.143 | 0.917 | 2.69** | 85.71 | 87.93 | 105 students |
| LaRose, Gregg and Eastin (1998) | Audiographic telecourses for the Web: An experiment | +0.070 | 0.281 | -0.481 | 0.621 | 0.25 | Unknown | Unknown | 49 students |
| Lowry (2007) | Effects of online versus face-to-face professional development with a team-based learning community approach on teachers' application of a new instructional practice | -0.281 | 0.335 | -0.937 | 0.370 | -0.84 | 80 | 93.55 | 53 students |
| Mentzer, Cryan and Teclehaimanot (2007) | A comparison of face-to-face and Web-based classrooms | -0.796 | 0.339 | -1.460 | -0.131 | -2.35* | Unknown | Unknown | 36 students |
| Nguyen et al. (2008) | Randomized controlled trial of an Internet-based versus face-to-face dyspnea self-management program for patients with chronic obstructive pulmonary disease: Pilot study | +0.292 | 0.316 | -0.327 | 0.910 | 0.93 | Unknown | Unknown | 39 participants |
| Ocker and Yaverbaum (1999) | Asynchronous computer-mediated communication versus face-to-face collaboration: Results on student learning, quality and satisfaction | -0.030 | 0.214 | -0.449 | 0.389 | -0.14 | Unknown | Unknown | 43 students |
| Padalino and Peres (2007) | E-learning: A comparative study for knowledge apprehension among nurses | +0.115 | 0.281 | -0.437 | 0.666 | 0.41 | Unknown | Unknown | 49 participants |
| Peterson and Bond (2004) | Online compared to face-to-face teacher preparation for learning standards-based planning skills | +0.100 | 0.214 | -0.320 | 0.520 | 0.47 | Unknown | Unknown | 4 sections |
| Schmeeckle (2003) | Online training: An evaluation of the effectiveness and efficiency of training law enforcement personnel over the Internet | -0.106 | 0.198 | -0.494 | 0.282 | -0.53 | Unknown | Unknown | 101 students |
| Schoenfeld-Tacher, McConnell and Graham (2001) | Do no harm: A comparison of the effects of online vs. traditional delivery media on a science course | +0.800 | 0.459 | -0.100 | 1.700 | 1.74 | 100 | 99.94 | Unknown |
| Sexton, Raven and Newman (2002) | A comparison of traditional and World Wide Web methodologies, computer anxiety, and higher order thinking skills in the inservice training of Mississippi 4-H extension agents | -0.422 | 0.385 | -1.177 | 0.332 | -1.10 | Unknown | Unknown | 26 students |
| Sun, Lin and Yu (2008) | A study on learning effect among different learning styles in a Web-based lab of science for elementary school students | +0.180 | 0.187 | -0.187 | 0.547 | 0.96 | Unknown | Unknown | 4 classrooms |
| Turner et al. (2006) | Web-based learning versus standardized patients for teaching clinical diagnosis: A randomized, controlled, crossover trial | +0.242 | 0.367 | -0.477 | 0.960 | 0.66 | Unknown | Unknown | 30 students |
| Vandeweerd et al. (2007) | Teaching veterinary radiography by e-learning versus structured tutorial: A randomized, single-blinded controlled trial | +0.144 | 0.207 | -0.262 | 0.550 | 0.70 | Unknown | Unknown | 92 students |
| Wallace and Clariana (2000) | Achievement predictors for a computer-applications module delivered online | +0.109 | 0.206 | -0.295 | 0.513 | 0.53 | Unknown | Unknown | 4 sections |
| Wang (2008) | Developing and evaluating an interactive multimedia instructional tool: Learning outcomes and user experiences of optometry students | -0.071 | 0.136 | -0.338 | 0.195 | -0.53 | Unknown | Unknown | 4 sections [c] |
| Zhang (2005) | Interactive multimedia-based e-learning: A study of effectiveness | +0.381 | 0.339 | -0.283 | 1.045 | 1.12 | Unknown | Unknown | 51 students |
| Zhang et al. (2006) | Instructional video in e-learning: Assessing the effect of interactive video on learning effectiveness | +0.499 | 0.244 | 0.022 | 0.977 | 2.05* | Unknown | Unknown | 69 students |

Exhibit reads: The effect size for the Hugenholtz et al. (2008) study of online medical education was +0.11, which was not significantly different from 0. 

*p < .05, **p < .01, SE = Standard error. 

[a] The number given represents the assigned units at study conclusion. It excludes units that attrited. 

[b] Two outcome measures were used to compute one effect size. The first outcome measure was completed by 17 participants, and the second outcome measure was 
completed by 20 participants. 

[c] This study is a crossover study. The number of units represents those assigned to treatment and control conditions in the first round. 
Exhibit 4b. Blended Versus Face-to-Face (Category 2) Studies Included in the Meta-Analysis 

| Authors | Title | Effect Size (g) | SE | 95% CI Lower Limit | 95% CI Upper Limit | Test of Null Hypothesis (2-tail) Z-Value | Retention Rate, Online (%) | Retention Rate, Face-to-Face (%) | Number of Units Assigned [a] |
|---|---|---|---|---|---|---|---|---|---|
| Aberson et al. (2003) | Evaluation of an interactive tutorial for teaching hypothesis testing concepts | +0.700 | 0.404 | -0.092 | 1.492 | 1.73 | Unknown | .75 | 2 sections |
| Al-Jarf (2004) | The effects of Web-based learning on struggling EFL college writers | +0.740 | 0.194 | 0.360 | 1.120 | 3.82*** | Unknown | Unknown | 113 students |
| Caldwell (2006) | A comparative study of traditional, Web-based and online instructional modalities in a computer programming course | +0.251 | 0.311 | -0.359 | 0.861 | 0.81 | 100 | 100 | 60 students |
| Davis et al. (1999) | Developing online courses: A comparison of Web-based instruction with traditional instruction | +0.335 | 0.338 | -0.327 | 0.997 | 0.99 | Unknown | Unknown | 2 courses/classrooms |
| Day, Raven and Newman (1998) | The effects of World Wide Web instruction and traditional instruction and learning styles on achievement and changes in student attitudes in a technical writing in agricommunication course | +1.113 | 0.289 | 0.546 | 1.679 | 3.85*** | 89.66 | 96.55 | 2 sections |
| DeBord, Aruguete and Muhlig (2004) | Are computer-assisted teaching methods effective? | +0.130 | 0.188 | -0.239 | 0.499 | 0.69 | Unknown | Unknown | 112 students |
| El-Deghaidy and Nouby (2008) | Effectiveness of a blended e-learning cooperative approach in an Egyptian teacher education program | +0.475 | 0.386 | -0.282 | 1.232 | 1.23 | Unknown | Unknown | 26 students |
| Englert et al. (2007) | Scaffolding the writing of students with disabilities through procedural facilitation using an Internet-based technology | +0.740 | 0.345 | 0.064 | 1.416 | 2.15* | Unknown | Unknown | 6 classrooms from 5 urban schools |
| Frederickson, Reed and Clifford (2005) | Evaluating Web-supported learning versus lecture-based teaching: Quantitative and qualitative perspectives | +0.138 | 0.345 | -0.539 | 0.814 | 0.40 | Unknown | Unknown | 2 sections |
| Gilliver, Randall and Pok (1998) | Learning in cyberspace: Shaping the future | +0.477 | 0.111 | 0.260 | 0.693 | 4.31*** | Unknown | Unknown | 24 classes |
| Long and Jennings (2005) [Wave 1] [c] | The effect of technology and professional development on student achievement | +0.025 | 0.046 | -0.066 | 0.116 | 0.53 | Unknown | Unknown | 9 schools |
| Long and Jennings (2005) [Wave 2] [c] | The effect of technology and professional development on student achievement | +0.554 | 0.098 | 0.362 | 0.747 | 5.65*** | Unknown | Unknown | 6 teachers |
| Maki and Maki (2002) | Multimedia comprehension skill predicts differential outcomes of Web-based and lecture courses | +0.171 | 0.160 | -0.144 | 0.485 | 1.06 | 91.01 | 88.10 | 155 students |
| Midmer, Kahan and Marlow (2006) | Effects of a distance learning program on physicians' opioid- and benzodiazepine-prescribing skills | +0.332 | 0.213 | -0.085 | 0.750 | 1.56 | Unknown | Unknown | 88 students |
| O'Dwyer, Carey and Kleiman (2007) | A study of the effectiveness of the Louisiana algebra 1 online course | +0.373 | 0.094 | 0.190 | 0.557 | 3.99*** | 88.51 | 64.4 | Unknown [b] |
| Rockman et al. (2007) [Writing] [c] | ED PACE final report | -0.239 | 0.102 | -0.438 | -0.039 | -2.34* | Unknown | Unknown | 28 classrooms |
| Rockman et al. (2007) [Multiple-choice test] | ED PACE final report | -0.146 | 0.102 | -0.345 | 0.054 | -1.43 | Unknown | Unknown | 28 classrooms |
| Schilling et al. (2006) [Search strategies] | An interactive Web-based curriculum on evidence-based medicine: Design and effectiveness | +0.585 | 0.188 | 0.216 | 0.953 | 3.11** | 68.66 | 59.62 | Unknown |
| Schilling et al. (2006) [Quality of care calculation] [c] | An interactive Web-based curriculum on evidence-based medicine: Design and effectiveness | +0.926 | 0.183 | 0.567 | 1.285 | 5.05*** | 66.42 | 86.54 | Unknown |
| Spires et al. (2001) | Exploring the academic self within an electronic mail environment | +0.571 | 0.357 | -0.130 | 1.271 | 1.60 | Unknown | 100.00 | 31 students |
| Suter and Perry (1997) | Evaluation by electronic mail | +0.140 | 0.167 | -0.188 | 0.468 | 0.84 | Unknown | Unknown | Unknown |
| Urban (2006) | A comparison of computer-based distance education and traditional tutorial sessions in supplemental instruction for students at-risk for academic difficulties | +0.264 | 0.192 | -0.112 | 0.639 | 1.37 | 96.86 | 73.85 | 110 students |
| Zacharia (2007) | Comparing and combining real and virtual experimentation: An effort to enhance students' conceptual understanding of electric circuits | +0.570 | 0.216 | 0.147 | 0.993 | 2.64** | 100 | 95.56 | 88 students |

Exhibit reads: The effect size for the Aberson et al. (2003) study of an interactive tutorial on hypothesis testing was +0.70, which was not significantly different from 0. 

*p < .05, **p < .01, ***p < .001, SE = Standard error. 

[a] This number represents the assigned units at study conclusion. It excludes units that attrited. 

[b] The study involved 18 online classrooms from six districts and two private schools; the same six districts were asked to identify comparable face-to-face classrooms, 
but the study does not report how many of those classrooms participated. 

[c] Two independent contrasts were contained in this article, which therefore appears twice in the table. 
Test for Homogeneity 

Both the Category 1 and Category 2 studies contrasted a condition with online elements with a 
condition of face-to-face instruction only. Analysts used the larger corpus of 51 effects that were 
either Category 1 or Category 2 to explore the influence of possible moderator variables. 

The individual effect size estimates included in this meta-analysis ranged from a low of -0.80 
(tendency for higher performance in the face-to-face condition) to a high of +1.11 (favoring 
online instruction). A test for homogeneity of effect size found significant differences across 
studies (Q = 145.58, p < .0001). Because of these significant differences in effect sizes, analysts 
investigated the variables that may have influenced the differing effect sizes. 
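
A homogeneity test of this kind can be sketched as follows. The example is illustrative only: it uses five Category 1 contrasts from Exhibit 4a rather than the full corpus, so it will not reproduce Q = 145.58. The DerSimonian-Laird estimator of between-study variance is an assumption chosen for illustration; the report does not name the estimator behind its mixed-effects model. Function names are hypothetical.

```python
from math import sqrt


def homogeneity_q(effects, std_errors):
    """Return (Q, fixed-effect mean, weights) using inverse-variance weights."""
    weights = [1.0 / se**2 for se in std_errors]
    mean = sum(w * g for w, g in zip(weights, effects)) / sum(weights)
    q = sum(w * (g - mean) ** 2 for w, g in zip(weights, effects))
    return q, mean, weights


def between_study_variance(q, weights):
    """DerSimonian-Laird estimate of tau^2 (assumed estimator), floored at zero."""
    df = len(weights) - 1
    c = sum(weights) - sum(w * w for w in weights) / sum(weights)
    return max(0.0, (q - df) / c)


# Illustrative subset of Exhibit 4a: (g, SE) for Jang et al., Mentzer et al.,
# Zhang et al., Wang, and Beeckman et al.
effects = [0.530, -0.796, 0.499, -0.071, 0.187]
std_errors = [0.197, 0.339, 0.244, 0.136, 0.137]

q, fixed_mean, weights = homogeneity_q(effects, std_errors)
tau2 = between_study_variance(q, weights)

# Adding tau^2 to each study's variance widens the pooled standard error.
fixed_se = sqrt(1.0 / sum(weights))
random_se = sqrt(1.0 / sum(1.0 / (se**2 + tau2) for se in std_errors))

print(f"Q = {q:.2f} on {len(effects) - 1} degrees of freedom")
print(f"tau^2 = {tau2:.3f}; pooled SE fixed = {fixed_se:.3f}, random = {random_se:.3f}")
```

Q is compared against a chi-square distribution with one fewer degree of freedom than the number of effects; a large Q indicates that the effects vary more than sampling error alone would predict, which motivates the search for moderator variables.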

Analyses of Moderator Variables 

As noted in chapter 1, this meta-analysis has distinguished between practice variables, which can 
be considered part of intervention implementation, and conditions, which are status variables that 
are fairly impervious to outside influence. Relying on prior research, the research team identified 
variables of both types that might be expected to correlate with the effectiveness of online 
learning. The researchers also considered the potential influence of study method variables, 
which often vary with effect size; typically, more poorly controlled studies show larger effects. 
Each study in the meta-analysis was coded for these three types of variables — practice, status, 
and study method — using the coding categories shown in the appendix. 

Many of the studies did not provide information about features considered to be potential 
moderator variables, a predicament noted in previous meta-analyses (see Bernard et al. 2004). 
Many of the reviewed studies, for example, did not indicate (a) whether or not the online 
instructor had received training in the method of instruction, (b) rates of attrition from the 
contrasting conditions and (c) contamination between conditions. 

For some of the variables, the number of studies providing sufficient information to support 
categorization as to whether or not the feature was present was too small to support a meaningful 
analysis. Analysts identified those variables for which at least two contrasting subsets of studies, 
with each subset containing six or more study effects, could be constructed. In some cases, this 
criterion could be met by combining related feature codes; in a few cases, the inference was 
made that failure to mention a particular practice or technology (e.g., one-way video) denoted its 
absence. Practice, conditions and method variables for which study subsets met the size criterion 
were included in the search for moderator variables. 
Practice Variables 



Exhibit 5 shows the variation in effectiveness associated with 12 practice variables. Analysis of 
these variables addresses the third research question: 

What practices are associated with more effective online learning? 

Exhibit 5 and the two data exhibits that follow show significance results both for the various 
subsets of studies considered individually and for the test of the dimension used to subdivide the 
study sample (i.e., the potential moderator variable). For example, in the case of Computer- 
Mediated Communication With Peers, both the 17 contrasts in which students in the online 
condition had only asynchronous communication with peers and the 7 contrasts in which online 
students had both synchronous and asynchronous communication with peers are shown in the 
table. The two subsets had mean effect sizes of +0.27 and +0.32, respectively, and both were 
statistically different from 0. The Q-statistic of homogeneity tests whether the variability in 
effect sizes for these contrasts is associated with the type of peer communication available. The 
Q-statistic for Computer-Mediated Communication With Peers (0.13) is not statistically 
significant, which is unsurprising because both studies of online learning with only asynchronous 
communication and those with both asynchronous and synchronous communication found 
similar positive effects on average. 
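
For a moderator with two subsets, the between-groups Q-statistic reduces to the squared difference between the subset means divided by the sum of their squared standard errors, evaluated against a chi-square distribution with one degree of freedom. The sketch below applies that textbook formula to the rounded subset means and standard errors listed in Exhibit 5; it is an approximation under those assumptions, not the analysts' software, and the function names are hypothetical.

```python
from math import erfc, sqrt


def q_between_two(g1: float, se1: float, g2: float, se2: float) -> float:
    """Between-groups Q for two subsets with inverse-variance weights."""
    return (g1 - g2) ** 2 / (se1**2 + se2**2)


def p_value_df1(q: float) -> float:
    """Upper-tail probability of a chi-square variate with 1 degree of freedom."""
    return erfc(sqrt(q / 2.0))


# Computer-mediated communication with peers (Exhibit 5):
# asynchronous only vs. synchronous plus asynchronous.
q_peers = q_between_two(0.268, 0.079, 0.321, 0.125)
p_peers = p_value_df1(q_peers)

# Time on task (Exhibit 5): online > face-to-face vs. same or face-to-face > online.
q_time = q_between_two(0.461, 0.110, 0.189, 0.084)
p_time = p_value_df1(q_time)

print(f"Peers:        Q = {q_peers:.2f}, p = {p_peers:.2f}")
print(f"Time on task: Q = {q_time:.2f}, p = {p_time:.3f}")
```

With these rounded inputs the peer-communication statistic comes out at about 0.13, matching the exhibit, and the time-on-task statistic lands close to the reported 3.88; small discrepancies are expected from rounding.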

The test of the moderator variable most central to this study (whether a blended online condition 
including face-to-face elements is associated with greater advantages over classroom instruction 
than is pure online learning) was discussed above. As noted there, the effect size for blended 
approaches contrasted against face-to-face instruction is larger than that for purely online 
approaches contrasted against face-to-face instruction. The other two practice variables included 
in the chapter 1 conceptual framework, learning experience type and synchronous versus 
asynchronous communication with the instructor, were tested in a similar fashion. Neither was 
found to moderate significantly the size of the online learning effect. However, examination of 
the learning experience study subsets indicated that the mean effect size for studies where the 
online learning was instructor-directed expository (+0.36) and the mean effect size for 
collaborative, interactive instruction (+0.28) were significantly positive whereas the mean effect 
size for independent, active online learning (+0.15) was not. 

Among the other 10 practices, which were not part of the conceptual model, only the amount of 
time that students in the treatment condition spent on task compared with students in the 
face-to-face condition proved to be a significant moderator. The mean effect size for studies with more 
time spent on task by online learners than learners in the control condition was +0.46 compared 
with +0.19 for studies in which the learners in the face-to-face condition spent as much time or 
more on task (Q = 3.88, p < .05). 



Online experiences in which students explored digital artifacts and controlled the specific material they wanted to 
view were categorized as “active” learning experiences. 

If the five K-12 studies are dropped from the meta-analysis corpus, the p value for this moderator variable rises to 
p < .06. 







Exhibit 5. Tests of Practices as Moderator Variables

Variable / Contrast                     Number of   Weighted      Standard   Lower    Upper    Q-Statistic
                                        Studies     Effect Size   Error      Limit    Limit

Pedagogy/learning experienceᵃ
  Instructor-directed (expository)      8           0.363**       0.115      0.138    0.588    3.03
  Independent (active)                  17          0.145         0.077      -0.005   0.296
  Collaborative (interactive)           23          0.283***      0.070      0.146    0.419

Computer-mediated communication with instructorᵃ
  Asynchronous only                     16          0.305***      0.095      0.120    0.491    0.97
  Synchronous + Asynchronous            9           0.153         0.123      -0.089   0.394

Computer-mediated communication with peersᵃ
  Asynchronous only                     17          0.268***      0.079      0.113    0.422    0.13
  Synchronous + Asynchronous            7           0.321**       0.125      0.076    0.567

Treatment durationᵃ
  Less than 1 month                     19          0.227**       0.082      0.066    0.389    0.07
  More than 1 month                     30          0.255***      0.063      0.132    0.378

Media featuresᵃ
  Text-based only                       15          0.281**       0.100      0.086    0.477    0.13
  Text + other media                    32          0.239***      0.060      0.121    0.357

Time on taskᵃ
  Online > Face to Face                 10          0.461***      0.110      0.246    0.676    3.88*
  Same or Face to Face > Online         17          0.189*        0.084      0.025    0.353

One-way video or audio
  Present                               15          0.118         0.082      -0.043   0.279    3.62
  Absent/Not reported                   36          0.308***      0.057      0.196    0.421

Computer-based instruction elements
  Present                               30          0.263***      0.061      0.144    0.382    0.20
  Absent/Not reported                   21          0.220**       0.077      0.069    0.371

Opportunity for face-to-face time with instructor
  During instruction                    21          0.277***      0.069      0.142    0.411    0.37
  Before or after instruction           12          0.220*        0.108      0.009    0.431
  Absent/Not reported                   18          0.217*        0.086      0.047    0.386

Opportunity for face-to-face time with peers
  During instruction                    21          0.274***      0.068      0.141    0.408    0.94
  Before or after instruction           13          0.160         0.102      -0.040   0.359
  Absent/Not reported                   17          0.266**       0.089      0.091    0.442

Opportunity to practice
  Present                               42          0.264***      0.052      0.161    0.366    0.65
  Absent/Not reported                   9           0.159         0.118      -0.072   0.391

Feedback provided
  Present                               24          0.248***      0.072      0.107    0.388    0.00
  Absent/Not reported                   27          0.247***      0.065      0.118    0.375

Exhibit reads: Studies in which time spent in online learning exceeded time in the face-to-face condition had a mean
effect size of +0.46 compared with +0.19 for studies in which face-to-face learners had as much or more instructional
time.

*p < .05, **p < .01, ***p < .001.

ᵃ The moderator analysis for this variable excluded studies that did not report information for this feature.
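
The Q-statistics in Exhibit 5 can be approximated from the subgroup effect sizes and standard errors shown in the exhibit. The sketch below assumes a fixed-effect, inverse-variance weighting (weights of 1/SE²); this section does not state the exact formula the analysts used, and the function name `q_between` is illustrative only.

```python
def q_between(groups):
    """Between-groups Q from (mean effect size, standard error) pairs.

    Assumption: each subgroup mean is weighted by 1 / SE**2 (fixed-effect model).
    """
    weights = [1.0 / se ** 2 for _, se in groups]
    # Inverse-variance weighted grand mean across the subgroups.
    pooled = sum(w * es for w, (es, _) in zip(weights, groups)) / sum(weights)
    # Weighted squared deviations of each subgroup mean from the grand mean.
    return sum(w * (es - pooled) ** 2 for w, (es, _) in zip(weights, groups))


# Subgroup values copied from Exhibit 5 (rounded as published).
time_on_task = [(0.461, 0.110), (0.189, 0.084)]
pedagogy = [(0.363, 0.115), (0.145, 0.077), (0.283, 0.070)]

print(round(q_between(time_on_task), 2))
print(round(q_between(pedagogy), 2))
```

With the rounded table values, this gives roughly 3.86 for time on task (reported: 3.88) and 3.03 for pedagogy/learning experience (reported: 3.03); the small gap is consistent with rounding in the published means and standard errors.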
Condition Variables



The strategy to investigate whether study effect sizes varied with publication year, which was 
taken as a proxy for the sophistication of available technology, involved splitting the study 
sample into two nearly equal subsets by contrasting studies published between 1997 and 2003 
against those published in 2004 through July 2008. 

The studies were divided into three subsets of learner type: K-12 students, undergraduate and 
community college students (the largest single group), and other types of learners (graduate 
students or individuals receiving job-related training). As noted above, the studies covered a 
wide range of subjects, but medicine and health care were the most common. Accordingly, these 
studies were contrasted against studies in other fields. Tests of these conditions as potential 
moderator variables addressed the study’s fourth research question: 

What conditions influence the effectiveness of online learning? 

None of the three conditions tested emerged as a statistically significant moderator variable. In 
other words, for the range of student types for which studies are available, online learning 
appeared more effective than traditional face-to-face instruction in both older and newer studies, 
with undergraduate and older learners, and in both medical and other subject areas. Exhibit 6 
provides the results of the analysis of conditions. 



Exhibit 6. Tests of Conditions as Moderator Variables

Variable / Contrast                     Number of   Weighted      Standard   Lower    Upper    Q-Statistic
                                        Contrasts   Effect Size   Error      Limit    Limit

Year Published
  1997-2003                             14          0.266**       0.095      0.080    0.453    0.06
  2004 or after                         37          0.240***      0.055      0.133    0.347

Learner Type
  K-12 students                         7           0.158         0.101      -0.040   0.356    3.70
  Undergraduate                         25          0.345***      0.069      0.209    0.480
  Graduate Student/Other                19          0.172*        0.077      0.021    0.324

Subject Matter
  Medical/Health care                   16          0.302***      0.084      0.136    0.467    0.63
  Other                                 35          0.221***      0.057      0.110    0.332

Exhibit reads: The positive effect associated with online learning over face-to-face instruction was
significant both for studies published between 1997 and 2003 and for those published in 2004 or later; the
effect size does not vary significantly with period of publication.

*p < .05, **p < .01, ***p < .001.
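
The confidence limits and significance markers in Exhibit 6 are consistent with a normal approximation applied to each weighted effect size. The sketch below assumes 95 percent limits of plus or minus 1.96 standard errors and a two-sided z-test; neither assumption is stated in this section, and the function name `summarize` is illustrative only.

```python
import math


def summarize(effect, se, z_crit=1.96):
    """Return (lower limit, upper limit, two-sided p-value) for one subgroup.

    Assumption: the weighted mean effect size is approximately normal.
    """
    lower = effect - z_crit * se
    upper = effect + z_crit * se
    z = effect / se
    # Two-sided tail probability of the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return lower, upper, p


# Rows copied from Exhibit 6 (rounded as published).
k12 = summarize(0.158, 0.101)
undergraduate = summarize(0.345, 0.069)

print([round(v, 3) for v in k12])
print([round(v, 3) for v in undergraduate[:2]])
```

Under these assumptions the K-12 interval spans zero (about -0.040 to 0.356), which matches the absence of a significance marker for that row, whereas the undergraduate interval lies well above zero.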
Because of the Evaluation of Evidence-Based Practices in Online Learning study’s emphasis on
K-12 education, the online learning studies involving K-12 students were of particular interest. 
The meta-analysis includes seven contrasts from five studies of K-12 school students’ online 
learning. Exhibit 7 describes these studies. 

Given the small number of studies that addressed K-12 learners in the meta-analysis, attempts to 
test for statistical differences between the mean effect for K-12 learners and those for other types 
of learners should be viewed as merely suggestive. At +0.16, the average of the seven contrasts
involving K-12 learners appears similar to that for graduate and other students (+0.17) but less
positive than that for undergraduates (+0.35). When learner type was tested as a moderator
variable, however, the resulting Q-statistic was not significant. 

Methods Variables 

The advantage of meta-analysis is its ability to uncover generalizable effects by looking across a 
range of studies that have operationalized the construct under study in different ways, studied it 
in different contexts, and used different methods and outcome measures. However, the inclusion 
of poorly designed and small-sample studies in the meta-analysis corpus poses concerns because 
doing so may give undue weight to spurious effects. Study methods variables were examined as 
potential moderators to explore this issue. The results are shown in Exhibit 8.

The influence of study sample size was examined by dividing studies into three subsets, 
according to the number of learners for which outcome data were collected. Sample size was not 
found to be a statistically significant moderator of online learning effects. Thus, there is no 
evidence that the inclusion of small-sample studies in the meta-analysis was responsible for the 
overall finding of a positive outcome for online learning. In fact, when the studies were 
segmented into sample-size categories, it was the studies with the largest sample sizes for which 
the most positive effects were found. 

Comparisons of the three designs deemed acceptable for this meta-analysis (random-assignment
experiments, quasi-experiments with statistical control, and crossover designs) indicate that study
design is not significant as a moderator variable (see Exhibit 8). Moreover, in contrast with early
meta-analyses in computer-based instruction, where effect size was inversely related to study
design quality (Pearson et al. 2005), those experiments that used random assignment in the
present corpus produced large positive effects (p < .001).
Exhibit 7. Studies of Online Learning Involving K-12 Students



The study corpus for this meta-analysis included five articles reporting on studies
involving K-12 students. All of these studies compared student learning in a blended condition 
with student learning in a face-to-face condition. One of the studies (Long and Jennings 2005, 
Wave 1 study) was a randomized control trial and the others were quasi-experiments. One of the 
quasi-experiments (Rockman et al. 2007) provided two effect sizes that favored the face-to-face 
condition; the other studies provided five effects favoring the blended condition (with a range 
from +0.03 to +0.74).

Rockman et al. (2007) used a quasi-experimental matched comparison design to evaluate the 
effectiveness of Spanish courses offered to middle schools (seventh and eighth grades) through 
the West Virginia Virtual School. This virtual school program used a blended model of 
instruction that combined face-to-face and virtual instruction as well as paper and pencil and 
Web-based activities. The program was delivered by a three-member teacher team that included 
a lead teacher (a certified Spanish teacher) who was responsible for the design and delivery of 
the daily lesson plan and weekly phone conversations with each class; an adjunct teacher (a 
certified Spanish teacher) who provided content-related feedback by means of e-mail and voice- 
mail and who graded student tests and products; and a classroom facilitator (a certified teacher, 
but not a Spanish teacher) who guided students on site to ensure that they stayed on task and 
completed assignments on time. The hybrid Spanish course was offered to students in 21 schools 
that did not have the resources to provide face-to-face Spanish instruction. The students in the 
face-to-face group came from seven schools that matched the virtual schools with respect to 
average language arts achievement and school size. The study involved a total of 463 students. 

Information needed to compute effect sizes was reported for two of the student learning 
measures used in the study. For the first of these, a multiple-choice test including subtests on oral 
and written comprehension of Spanish, the mean estimated effect was -0.15, and the difference 
between the two conditions was not statistically significant. The other measure was a test of 
students’ writing ability, and the effect size for this skill was -0.24, with students receiving face- 
to-face instruction doing significantly better than those receiving the online blended version of 
the course. 

Contrasting results were obtained in the other large-scale K-12 study, conducted by O’Dwyer, 
Carey and Kleiman (2007). These investigators used a quasi-experimental design to compare the 
learning of students participating in the Louisiana Algebra I Online initiative with the learning of 
students in comparison classrooms that were “similar with regard to mathematics ability, 
environment, and size, but where teachers used traditional ‘business as usual’ approaches to 
teaching algebra” (p. 293). Like the West Virginia Virtual School program, this initiative used a 
blended model of instruction that combined face-to-face and Web-based activities with two 
teachers: one in class and the other online. Matched pre- and posttest scores on researcher- 
developed multiple-choice tests were collected from a total of 463 students (231 from the
treatment group, 232 from the comparison group) from multiple schools and school districts. An
effect size of +0.37 was obtained, with online students performing better than their peers in
conventional classrooms. 



Long and Jennings (2005) examined whether the performance of eighth-grade students whose 
teachers integrated the use of the Pathways to Freedom Electronic Field Trips — an online 
collection of interactive activities designed by Maryland Public Television — improved compared 
with performance of students whose teachers taught the same content without the online 
materials. The study provided two sets of analyses from two waves of data collection, yielding 
two independent effect sizes. The first set of analyses involved the data from nine schools in two 
Maryland districts. Schools were assigned randomly to conditions. Teachers in both conditions 
covered the same learning objectives related to slavery and the Underground Railroad, with the 
treatment teachers using the Pathways to Freedom Electronic Field Trips materials. A small 
effect size of +0.03 favoring the online condition was computed from change scores on
researcher-developed multiple-choice tests administered to 971 students. 

Long and Jennings’ (2005, Wave 2) second study involved a subset of teachers from one of the
two participating districts, which was on a semester schedule. The teachers from this district 
covered the same curriculum twice during the year for two different sets of students. The gain 
scores of 846 students of six teachers (three treatment teachers and three control teachers) from 
both semesters were collected. Regression analysis indicated an effect size of +0.55 favoring the
online conditions. This study also looked into the maturation effects of teachers’ using the online 
materials for the second time. As hypothesized, the results showed that the online materials were 
used more effectively in the second semester. 

Sun, Lin and Yu (2008) conducted a quasi-experimental study to examine the effectiveness of a
virtual Web-based science lab with 113 fifth-grade students in Taiwan. Although both treatment 
and control groups received an equal number of class hours and although both groups conducted 
manual experiments, students in the treatment condition used the virtual Web-based science lab 
for part of their lab time. The Web-based lab enabled students to conduct virtual experiments 
while teachers observed student work and corrected errors online. The control group students 
conducted equivalent experiments using conventional lab equipment. Matched pre- and posttest 
scores on researcher-developed assessments were collected for a total of 113 students (56 from
the treatment group and 57 from the comparison group) in four classrooms from two randomly
sampled schools. An effect size of +0.18 favoring the virtual lab condition was obtained from
analysis of covariance results, controlling for pretest scores. 

A small-scale quasi-experiment was conducted by Englert et al. (2007). This study examined the 
effectiveness of a Web-based writing support program with 35 elementary-age students from six 
special education classrooms across five urban schools. Students in the treatment group used a 
Web-based program that supported writing performance by prompting attention to the topical 
organization and structure of ideas during the planning and composing phases of writing. Control 
students used similar writing tools provided in traditional paper-and-pencil formats. Pre- and
posttests of student writing, scored on a researcher-developed rubric, were used as outcome
measures. An effect size of +0.74 favoring the online condition was obtained from an analysis of
covariance controlling for writing pretest scores. 
Exhibit 8. Tests of Study Features as Moderator Variables

Variable / Contrast                                     Number of   Weighted      Standard   Lower    Upper    Q-Statistic
                                                        Studies     Effect Size   Error      Limit    Limit

Sample size
  Fewer than 35                                         11          0.312*        0.133      0.051    0.574    0.28
  From 35 to 100                                        21          0.240**       0.080      0.083    0.396
  More than 100                                         19          0.235***      0.066      0.106    0.364

Type of knowledge testedᵃ
  Declarative                                           13          0.191*        0.089      0.017    0.365    1.08
  Procedural/Procedural and declarative                 29          0.293***      0.065      0.166    0.419
  Strategic knowledge                                   5           0.335*        0.158      0.025    0.644

Study design
  Random assignment control                             33          0.279***      0.061      0.158    0.399    0.71
  Quasi-experimental design with statistical control    13          0.203*        0.091      0.025    0.381
  Crossover design                                      5           0.178         0.151      -0.117   0.473

Unit of assignment to conditionsᵃ
  Individual                                            33          0.207***      0.060      0.088    0.325    5.18
  Class section                                         7           0.517***      0.129      0.264    0.770
  Course/School                                         9           0.190*        0.093      0.009    0.372

Instructor equivalenceᵃ
  Same instructor                                       20          0.227**       0.072      0.086    0.368    0.67
  Different instructor                                  20          0.146*        0.068      0.013    0.280

Equivalence of curriculum/instructionᵃ
  Identical/Almost identical                            30          0.200***      0.056      0.091    0.309    5.40*
  Different/Somewhat different                          17          0.418***      0.075      0.270    0.566

Exhibit reads: The average effect size was significantly positive for studies with a sample size of less
than 35 as well as for those with 35 to 100 and those with a sample size larger than 100; the weighted
average effect did not vary with size of the study sample.

*p < .05, **p < .01, ***p < .001.

ᵃ The moderator analysis excluded some studies because they did not report information about this
feature.








Effect sizes do not vary depending on whether or not the same instructor or instructors taught in 
the face-to-face and online conditions (Q = 0.67, p > .05). The average effect size for the 20 
contrasts in which instructors were the same across conditions was +0.23, p < .01. The average
effect size for contrasts in which instructors varied across conditions was +0.15, p < .05. The
only study method variable that proved to be a significant moderator of effect size was 
comparability of the instructional materials and approach for treatment and control students. 

The analysts coding study features examined the descriptions of the instructional materials and 
the instructional approach for each study and coded them as “identical,” “almost identical,” 
“different” or “somewhat different” across conditions. Adjacent coding categories were 
combined (creating the two study subsets Identical/Almost Identical and Different/Somewhat
Different) to test Equivalence of Curriculum/Instruction as a moderator variable. Equivalence of 
Curriculum/Instruction was a significant moderator variable (Q = 5.40, p < .05). An examination 
of the study subgroups shows that the average effect for studies in which online learning and 
face-to-face instruction were described as identical or nearly so was +0.20, p < .01, compared
with an average effect of +0.42 (p < .001) for studies in which curriculum materials and
instructional approach varied across conditions. 

A marginally significant effect was found for the unit assigned to treatment and control
conditions. Effects tended to be smaller in studies in which individual students or sections, rather
than whole courses or schools, were assigned to online and face-to-face conditions (Q = 5.18,
p < .10).⁹
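
The significance levels quoted for these Q-statistics can be checked against a chi-square distribution. The sketch below assumes that Q is referred to a chi-square distribution with degrees of freedom equal to the number of subgroups minus one (1 for the two curriculum-equivalence subsets, 2 for the three units of assignment); that convention is standard in moderator analysis but is not spelled out in this section, and `chi2_upper_tail` is an illustrative helper rather than anything used by the analysts.

```python
import math


def chi2_upper_tail(q, df):
    """Upper-tail chi-square probability, using closed forms for df = 1 or 2."""
    if df == 1:
        # P(Z**2 > q) for a standard normal Z.
        return math.erfc(math.sqrt(q / 2.0))
    if df == 2:
        # The chi-square distribution with 2 df is exponential with mean 2.
        return math.exp(-q / 2.0)
    raise ValueError("this sketch only handles df = 1 or df = 2")


# Q values quoted in the text; the df values are the assumption described above.
p_curriculum = chi2_upper_tail(5.40, df=1)
p_unit = chi2_upper_tail(5.18, df=2)

print(round(p_curriculum, 3))
print(round(p_unit, 3))
```

Under these assumptions, the curriculum-equivalence test falls between .01 and .05, and the unit-of-assignment test falls between .05 and .10, in line with the "significant" and "marginally significant" descriptions above.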

The moderator variable analysis for aspects of study method also found additional patterns that 
did not attain statistical significance but that should be re-tested once the set of available rigorous 
studies of online learning has expanded. The type of learning outcome tested, for example, may 
influence the magnitude of effect sizes. Thirteen studies measured declarative knowledge 
outcomes only, typically through multiple-choice tests. A larger group of studies (29) looked at 
students’ ability to perform a procedure, or they combined procedural and declarative knowledge 
outcomes in their learning measure. Five studies used an outcome measure that focused on
strategic knowledge. (Four studies did not describe their outcome measures in enough detail to
support categorization.) Among the subsets of studies, the average effect for studies that included
procedural knowledge in their learning outcome measure (effect size of +0.29) and that for
studies that measured strategic knowledge (effect size of +0.34) appeared larger than the mean
effect size for studies that used a measure of declarative knowledge only (+0.19). Even so, the
Type of Knowledge Tested was not a significant moderator variable (Q = 1.08, p > .05).



⁹ This moderator variable is statistically significant if the five K-12 studies are excluded from the analysis.
4. Narrative Synthesis of Studies
Comparing Variants of Online Learning 



This chapter presents a narrative summary of Category 3 studies — those that examined the
learning effects of variations in online practices such as different versions of blended instruction
or online learning with and without immediate feedback to the learner. The literature search and
screening (described in chapter 2) identified 84 Category 3 studies reported in 79 articles.¹⁰

Within the set of Category 3 studies, five used K-12 students as subjects and 10 involved K-12
teacher education or professional development. College undergraduates constituted the most
common learner type (see Exhibit 9). All Category 3 studies involved formal education. Course
content for Category 3 studies covered a broad range of subjects, including observation skills,
understanding Internet search engines, HIV/AIDS knowledge and statistics.

When possible, the treatment manipulations in Category 3 studies were coded using the practice
variable categories that were used in the meta-analysis to facilitate comparisons of findings
between the meta-analysis and the narrative synthesis. No attempt was made to statistically
combine Category 3 study results, however, because of the wide range of conditions compared in
the different studies.



Exhibit 9. Learner Types for Category 3 Studies

Educational Level                    Number of Studies

K-12                                 5
Undergraduate                        37
Graduate                             4
Medicalᵃ                             18
Teacher professional developmentᵇ    10
Adult training                       4
Otherᶜ                               4
Not available                        2
Total                                84

Exhibit reads: K-12 students were the learners in 5 of the 84 studies of
alternative online practices.

ᵃ The medical category spans undergraduate and graduate educational levels
and includes nursing and related training.

ᵇ Teacher professional development includes preservice and inservice training.

ᶜ The Other category includes populations consisting of a combination of
learner types such as student and adult learners or undergraduate and
graduate learners.



¹⁰ Some articles contained not only contrasts that fit the criteria for Category 1 or 2 but also contrasts that fit
Category 3. The appropriate contrasts between online and face-to-face conditions were used in the meta-analysis; 
the other contrasts were reviewed as part of the Category 3 narrative synthesis presented here. 
Blended Compared With Pure Online Learning

The meta-analysis of Category 1 and 2 studies described in chapter 3 found that effect sizes were 
larger for studies that compared blended learning conditions with face-to-face instruction than 
for studies that compared purely online learning with face-to-face instruction. Another way to 
investigate the same issue is by conducting studies that incorporate both blended and purely 
online conditions to permit direct comparisons of their effectiveness. 

The majority of the 10 Category 3 studies that directly compared purely online and blended 
learning conditions found no significant differences in student learning. Seven studies found no 
significant difference between the two, two found statistically significant advantages for purely 
online instruction, and one found an advantage for blended instruction. The descriptions of some 
of these studies, provided below, make it clear that although conditions were labeled as 
“blended” or “purely online” on the basis of their inclusion or exclusion of face-to-face 
interactions, conditions differed in terms of content and quality of instruction. Across studies, 
these differences in the nature of purely online and blended conditions very likely contributed to 
the variation in outcomes. 

Keefe (2003), for example, contrasted a section of an organizational behavior course that 
received lectures face-to-face with another section that watched narrated PowerPoint slides 
shown online or by means of a CD-ROM. Both groups had access to e-mail, online chat rooms, 
and threaded discussion forums. All course materials were delivered electronically to all students 
at the same time. On the course examination, students in the purely online section scored almost 
8 percent lower than those receiving face-to-face lectures in addition to the online learning 
activities. Keefe’s was the only study in the review that found a significant decrement in 
performance for the condition without face-to-face instructional elements. 

Poirier and Feldman (2004) compared a course that was predominantly face-to-face but also used 
an online discussion board with a course taught entirely online. Students in the predominantly 
face-to-face version of the course were required to participate in three online discussions during 
the course and to post at least two comments per discussion to an online site; the site included 
content, communication and assessment tools. In the purely online version of the course, students 
and the instructor participated in two online discussions each week. Poirier and Feldman found a 
significant main effect favoring the purely online course format for examination grades but no 
effect on student performance on writing assignments. 

Campbell et al. (2008) compared a blended course (in which students accessed instruction online 
but attended face-to-face discussions) with a purely online course (in which students accessed 
instruction and participated in discussions online). Tutors were present in both discussion 
formats. Students were able to select the type of instruction they wanted, blended or online.
Mean scores for online discussion students were significantly higher than those for the face-to-
face discussion group.

As a group, these three studies suggest that the relative efficacy of blended and purely online 
learning approaches depends on the instructional elements of the two conditions. For the most 
part, these studies did not control instructional content within the two delivery conditions (blend
of online and face-to-face versus online only). For example, the lecturer in the Keefe (2003)
study may have covered material not available to the students reviewing the lecture’s PowerPoint 
slides online. Alternatively, in the Poirier and Feldman (2004) study, students interacting with the
instructor in two online discussions a week may have received more content than did those 
receiving face-to-face lectures. 

Davis et al. (1999) attempted to equate the content delivered in their three class sections (online, 
traditional face-to-face, and a blended condition in which students and instructor met face-to- 
face but used the online modules). Students in an educational technology course were randomly 
assigned to one of the three sections. No significant differences among the three conditions were 
found in posttest scores on a multiple-choice test. 

An additional six studies contrasting purely online conditions and blended conditions (without 
necessarily equating learning content across conditions) also failed to find significant differences 
in student learning. Ruchti and Odell (2002) compared test scores from two groups of students 
taking a course on elementary science teaching methods. One group took online modules; the 
other group received instruction in a regular class, supplemented with an online discussion board 
and journal (also used in the online course condition). No significant difference between the 
groups was found. 

Beile and Boote (2002) compared three groups: one with face-to-face instruction alone, another 
with face-to-face instruction and a Web-based tutorial, and a third with Web-based instruction 
and the same Web-based tutorial. The final quiz on library skills indicated no significant 
differences among conditions. 

Gaddis et al. (2000) compared composition students’ audience awareness between a blended 
course and a course taught entirely online. The same instructor taught both groups, which also 
had the same writing assignments. Both groups used networked computers in instruction, in 
writing and for communication. However, the “on campus” group met face-to-face, giving 
students the opportunity to communicate in person, whereas the “off campus” group met only 
online. The study found no significant difference in learner outcomes between the two groups. 

Similarly, Caldwell (2006) found no significant differences in performance on a multiple-choice 
test between undergraduate computer science majors enrolled in a blended course and those 
enrolled in an online course. Both groups used a Web-based platform for instruction, which was 
supplemented by a face-to-face lab component for the blended group. 

Scoville and Buskirk (2007) examined whether the use of traditional or virtual microscopy 
would affect learning outcomes in a medical histology course. Students were assigned to one of 
four sections: (a) a control section where learning and testing took place face-to-face, (b) a 
blended condition where learning took place virtually and the practical examination took place 
face-to-face, (c) a second blended condition where learning took place face-to-face and testing 
took place virtually, and (d) a fully online condition. Scoville and Buskirk found no significant 
differences in unit test scores by learning groups. 

Finally, McNamara et al. (2008) studied the effectiveness of different approaches to teaching a
weight-training course. They divided students into three groups: a control group that received
face-to-face instruction, a blended group that received a blend of online and face-to-face
instruction, and a fully online group. The authors did not find a significant main effect for group
type.¹¹

Thus, as a group, these studies do not provide a basis for choosing online versus blended
instructional conditions.

Media Elements 

Eight studies in the Category 3 corpus compared online environments using different media 
elements such as one-way video (Maag 2004; McKethan et al. 2003; Schmeeckle 2003; 
Schnitman 2007; Schroeder 2006; Schutt 2007; Tantrarungroj 2008; Zhang et al. 2006). Seven of 
the eight studies found no significant differences among media combinations. In the study that 
found a positive effect from enhanced media features, Tantrarungroj (2008) compared two 
instructional approaches for teaching a neuroscience lesson to undergraduate students enrolled in 
computer science classes. The author contrasted an experimental condition in which students 
were exposed to online text with static graphics and embedded video with a control condition in 
which students did not have access to the streaming video. Tantrarungroj found no significant 
difference in grades for students in the two conditions on a posttest administered immediately 
after the course; however, the treatment group scored significantly higher on a knowledge 
retention test that was administered 4 weeks after the intervention. 

The other seven studies found no effect on learning from adding media to online
instruction. For example, Schnitman (2007) sought to determine whether enhancing text with 
graphics, navigation options, and color would affect learning outcomes. The author randomly 
assigned students to one of two conditions in a Web-based learning interface; the control group 
accessed a plain, text-based interface, and the treatment group accessed an enhanced interface 
that featured additional graphics, navigational options, and an enhanced color scheme.
Schnitman found no significant differences in learning outcomes between the treatment and
control groups.

The fact that the majority of studies found no significant difference across media types is 
consistent with the theoretical position that the medium is simply a carrier of content and is 
unlikely to affect learning per se (Clark 1983, 1994). A study by Zhang et al. (2006) suggests 
that the way in which a medium is used is more important than merely having access to it. Zhang 
et al. found that the effect of video on learning hinged on the learner’s ability to control the video 
(“interactive video”). The authors used four conditions: traditional face-to-face and three online 
environments — interactive video, noninteractive video, and nonvideo. Students were randomly 
assigned to one of the four groups. Students in the interactive video group performed 
significantly better than the other three groups. There was no statistical difference between the 
online group that had noninteractive video and the online group that had no video. 



¹¹ However, in tests of cognitive knowledge and strength, both the control and blended sections showed significant
improvements, whereas the fully online section showed no significant pre- to posttest growth for either outcome. 
In summary, many researchers have hypothesized that the addition of images, graphics, audio,
video or some combination would enhance student learning and positively affect achievement. 
However, the majority of studies to date have found that these media features do not affect 
learning outcomes significantly. 

Learning Experience Type 

Other Category 3 studies manipulated different features of the online learning environment to 
investigate the effects of learner control or type of learning experience. The learning experience 
studies provide some evidence that suggests an advantage for giving learners an element of 
control over the online resources with which they engage; however, the studies’ findings are 
mixed with respect to the relative effectiveness of the three learning experience types in the 
conceptual framework presented in chapter 2. 

Four studies (Cavus et al. 2007; Dinov, Sanchez and Christou 2008; Gao and Lehman 2003; 
Zhang 2005) provide preliminary evidence supporting the hypothesis that conditions in which 
learners have more control of their learning (either active or interactive learning experiences in 
our conceptual framework) produce larger learning gains than do instructor-directed conditions 
(expository learning experiences). Three other studies failed to find such an effect (Cook et al. 
2007; Evans 2007; Smith 2006). 

Zhang (2005) reports on two studies comparing expository learning with active learning, both of 
which found statistically significant results in favor of active learning. Zhang manipulated the 
functionality of a Web course to create two conditions. For the control group, video and other 
instruction received over the Web had to be viewed in a specified order, videos had to be viewed 
in their entirety (e.g., a student could not fast forward) and rewinding was not allowed. The 
treatment group could randomly access materials, watching videos in any sequence, rewinding 
them and fast forwarding through their content. Zhang found a statistically significant positive 
effect in favor of learner control over Web functionality (see also the Zhang et al. 2006 study 
described above). Gao and Lehman (2003) found that students who were required to complete a 
“generative activity” in addition to viewing a static Web page performed better on a test about 
copyright law than did students who viewed only the static Web page. Cavus, Uzonboylu and 
Ibrahim (2007) compared the success rates of students learning the Java programming language 
who used a standard collaborative tool with the success rate of those who used an advanced 
collaborative tool that allowed compiling, saving and running programs inside the tool. The 
course grades for students using the advanced collaborative tool were higher than those of 
students using the more standard tool. Similarly, Dinov, Sanchez and Christou (2008) integrated 
tools from the Statistics Online Computational Resource in three courses in probability and 
statistics. For each course, two groups were compared: one group of students received a “low- 
intensity” experience that provided them with access to a few online statistical tools; the other 
students received a “high-intensity” condition with access to many online tools for acting on 
data. Across the three classes, pooling all sections, students in the more active, high-intensity 
online tool condition demonstrated better understanding of the material on mid-term and final 
examinations than did the other students. 

These studies that found positive effects for learner control and nondidactic forms of instruction 
are counterbalanced by studies that found mixed or null effects from efforts to provide a more 
active online learning experience. Using randomly assigned groups of nurses who learned about 
pain management online, Smith (2006) altered the instructional design to compare a text-based, 
expository linear design with an instructional design involving participant problem solving and 
inquiry. No significant difference was found between the two groups in terms of learning 
outcomes. Cook et al. (2007) found no differences in student learning between a condition with 
end-of-module review questions that required active responses and a condition with expository 
end-of-module activities. Evans (2007) explored the effects of more and less expository online 
instruction for students learning chemistry lab procedures. After asking students to complete an 
online unit that was either text-based or dynamic and interactive, Evans found that SAT score 
and gender were stronger predictors of student performance on a posttest with conceptual and 
procedural items than was the type of online unit to which students were exposed. 

Golanics and Nussbaum (2008) examined the effect of “elaborated questions” and “maximizing 
reasons” prompts on students’ ability to construct and critique arguments. Students were 
randomly divided into groups of three; each group engaged in asynchronous discussions. Half of 
the groups received “elaborated questions,” which explicitly instructed them to think of 
arguments and counterarguments, whereas the other half of the groups viewed unelaborated 
questions. In addition, half of the groups randomly received prompts to provide justifications and 
evidence for their arguments (called the “maximizing reasons” condition); half of the groups did 
not receive those prompts. Elaborated questions stimulated better-developed arguments, but 
maximizing reasons instructions did not. 

Chen (2007) randomly assigned students in a health-care ethics class to one of three Web-based 
conditions: (a) a control group that received online instruction without access to an advanced 
organizer; (b) a treatment group that studied a text-based advanced organizer before online 
instruction; and (c) a second treatment group that reviewed an advanced, Flash-based concept 
map organizer before engaging in online learning. The authors hypothesized that both the 
advanced organizer and the concept map would help students access relevant prior knowledge 
and increase their active engagement with the new content. Contrary to expectations, Chen found 
no significant differences in learning achievement across the three groups. 

Suh (2006) examined the effect of guiding questions on students’ ability to produce a good 
educational Web site as required in an online educational technology course. Students in the 
guiding-question condition received questions through an electronic discussion board and were 
required to read the questions before posting their responses. E-mails and online postings 
reminded them to think about the guiding questions as they worked through the problem 
scenario. Guiding questions were found to enhance the performance of students working alone, 
but they did not produce benefits for students working in groups. One possible explanation 
offered by the author is that students working in groups may scaffold each other’s work, hence 
reducing the benefit derived from externally provided questions. 



Flash animations are created using Flash software from Adobe; a concept map is a graphic depiction of a set of 
ideas and the linkages among them. 

Computer-Based Instruction 



The advantage of incorporating elements that are generally found in stand-alone computer-based 
instruction into online learning seems to depend on the nature of the contrasting conditions. 
Quizzes, simulations and individualized instruction, all common to stand-alone computer-based 
instruction, appear to vary in their effectiveness when added to an online learning environment. 

Online Quizzes 

Research on incorporating quizzes into online learning does not provide evidence that the 
practice is effective. The four studies that examined the effectiveness of online quizzes (Lewis 
2002; Maag 2004; Stanley 2006; Tselios et al. 2001) had mixed findings. Maag (2004) and 
Stanley (2006) found no advantage for the inclusion of online quizzes. Maag included online 
quizzes in a treatment condition that also provided students with online images, text and some 
animation; the treatment group was compared with other groups, which differed both in the 
absence of online quizzes and in terms of the media used (one had the same text and images 
delivered online, one had printed text only, and one had printed text plus images). Maag found 
no significant difference between the online group that had the online quizzes and the online 
group that did not. Stanley (2006) found that outcomes for students taking weekly online quizzes 
did not differ statistically from those for students who completed homework instead. 

Two other studies suggested that whether or not quizzes positively affect learning may depend 
on the presence of other variables. Lewis (2002) grouped students into two cohorts. For six 
modules, Group 1 took online quizzes and Group 2 participated in online discussions. For six 
other modules, the groups switched so that those who had been taking the online quizzes 
participated in online discussions and vice versa. When Group 1 students took the online quizzes, 
they did significantly better than those participating in discussions, but no difference was found 
between the groups when Group 2 took the online quizzes in the other six modules. The 
researchers interpreted this interaction between student group and condition in terms of the 
degree of interactivity in the online discussion groups. Group 1 was more active in the online 
discussions, and the authors suggested that this activity mitigated any loss in learning otherwise 
associated with not taking quizzes. 

Tselios et al. (2001) suggest that the software platform used to deliver an online quiz may affect 
test performance. In their study, students completing an online quiz in WebCT performed 
significantly better than students taking the online quiz on a platform called IDLE. The 
educational content in the two platforms was identical and their functionality was similar; 
however, they varied in the details of their user interfaces. 

Simulations 

The results of three studies exploring the effects of including different types of online simulations 
were modestly positive. Two of the studies indicated a positive effect from including an online 
simulation; however, one study found no significant difference. In an online module on 
information technology for undergraduate psychology students, Castaneda (2008) contrasted two 
simulation conditions (one provided a simulation that students could explore as they chose, and 
the other guided the students’ interaction with the simulation, providing some feedback and 
expository material) with a condition that included no simulation. Castaneda also manipulated 
the sequencing of instructional activities, with the interaction with the simulation coming either 
before or after completion of the expository portion of the instructional module. Knowledge 
gains from pre- to posttest were greater for students with either type of simulation, provided they 
were exposed to it after, rather than before, the expository instruction. 

Hibelink (2007) explored the effectiveness of using two-dimensional versus three-dimensional 
images of human anatomy in an online undergraduate human anatomy lab. The group of students 
that used three-dimensional images had a small, but significant advantage in identifying 
anatomical parts and spatial relationships. Contrasting results were obtained by Loar (2007) in an 
examination of the effects of computer-based case study simulations on students’ diagnostic 
reasoning skills in nurse practitioner programs. All groups received identical online lectures, 
followed by an online text-based case study for one group and by completion of a computer- 
simulated case study for the other. No difference was found between the group receiving the case 
simulation versus that receiving the text-based version of the same case. 

Individualized Instruction 

The online learning literature has also explored the effects of using computer-based instruction 
elements to individualize instruction so that the online learning module or platform responds 
dynamically to the participant’s questions, needs or performance. There were only two online 
learning studies of the effects of individualizing instruction, but both found a positive effect. 
Nguyen (2007) compared the experiences of people learning to complete tax preparation 
procedures, contrasting those who used more basic online training with those who used an 
enhanced interface that incorporated a context-sensitive set of features, including integrated 
tutorials, expert systems, and content delivered in visual, aural and textual forms. Nguyen found 
that this combination of enhancements had a positive effect. 

Grant and Courtoreille (2007) studied the use of post-unit quizzes presented either as (a) fixed 
items that provided feedback only about whether or not the student’s response was correct or (b) 
response-sensitive items that gave the student the opportunity for additional practice on item 
types that had been answered incorrectly. The response-sensitive version of the tutorial was found 
to be more effective than the fixed-item version, resulting in greater changes between pre- and 
posttest scores. 

Supports for Learner Reflection 

Nine studies (Bixler 2008; Chang 2007; Chung, Chung and Severance 1999; Cook et al. 2005; 
Crippen and Earl 2007; Nelson 2007; Saito and Miwa 2007; Shen, Lee and Tsai 2007; Wang et 
al. 2006) examined the degree to which promoting aspects of learner reflection in a Web-based 
environment improved learning outcomes. These studies found that a tool or feature prompting 
students to reflect on their learning was effective in improving outcomes. 

For example, Chung, Chung and Severance (1999) examined how computer prompts designed to 
encourage students to use self-explanation and self-monitoring strategies affected learning, as 
measured by students’ ability to integrate ideas from a lecture into writing assignments. Chung et 
al. found that students in the group receiving the computer prompts integrated and elaborated a 
significantly higher number of the concepts in their writing than did those in the control group. 

In a quasi-experimental study of Taiwan middle school students taking a Web-based biology 
course, Wang et al. (2006) found that students in the condition using a formative online self- 
assessment strategy performed better than those in conditions using traditional tests, whether the 
traditional tests were online or administered in paper-and-pencil format. In the formative online 
assessment condition, when students answered an item incorrectly, they were told that their 
response was not correct, and they were given additional resources to explore to find the correct 
answer. (They were not given the right answer.) This finding is similar to that of Grant and 
Courtoreille (2007) described above. 

Cook et al. (2005) investigated whether the inclusion of “self-assessment” questions at the end of 
modules improved student learning. The study used a randomized, controlled, crossover trial, in 
which each student took four modules, two with the self-assessment questions and two without. 
The order of modules was randomly assigned. Student performance was significantly higher on 
tests taken immediately after completion of modules that included self-assessment questions than 
after completion of those without such questions — an effect that the authors attributed to the 
stimulation of reflection. This effect, however, did not persist on an end-of-course test, on which 
all students performed similarly. 

Shen, Lee and Tsai (2007) found a combination of effects for self-regulation and opportunities to 
learn through realistic problems. They compared the performance of students who did and did 
not receive instruction in self-regulation learning strategies such as managing study time, goal- 
setting and self-evaluation. The group that received instruction in self-regulated learning 
performed better in their online learning. 

Bixler (2008) examined the effects of question prompts asking students to reflect on their 
problem-solving activities. Crippen and Earl (2007) investigated the effects of providing students 
with examples of chemistry problem solutions and prompts for students to provide explanations 
regarding their work. Chang (2007) added a self-monitoring form for students to record their 
study time and environment, note their learning process, predict their test scores and create a 
self-evaluation. Saito and Miwa (2007) investigated the effects of student reflection exercises 
during and after online learning activities. Nelson (2007) added a learning guidance system 
designed to support a student’s hypothesis generation and testing processes without offering 
direct answers or making judgments about the student’s actions. In all of these studies, the 
additional reflective elements improved students’ online learning. 

Overall, the available research evidence suggests that promoting self-reflection, self-regulation 
and self-monitoring leads to more positive online learning outcomes. Features such as prompts 
for reflection, self-explanation and self-monitoring strategies have shown promise for improving 
online learning outcomes. 







Moderating Online Groups 

Organizations providing or promoting online learning generally recommend the use of 
instructors or other adults as online moderators, but research support for the effects of this 
practice on student learning is mixed. A study by Bernard and Lundgren-Cayrol (2001) suggests 
that instructor moderation may not improve learning outcomes in all contexts. The study was 
conducted in a teacher education course on educational technology in which the primary 
pedagogical approach was collaborative, project-based learning. Students in the course were 
randomly assigned to groups receiving either low or high intervention on the part of a moderator 
and composed of either random or self-selected partners. The study did not find a main effect for 
moderator intervention. In fact, the mean examination scores of the low-moderation, random- 
selection groups were significantly higher than those of the other groups. A study by De Wever, 
Van Winckel and Valcke (2008) also found mixed effects resulting from instructor moderation. 
This study was conducted during a clinical rotation in pediatrics in which knowledge of patient 
management was developed through case-based asynchronous discussion groups. Researchers 
used a crossover design to create four conditions based on two variables: the type of moderator 
(instructor moderator versus student moderator) and the presence of a developer of alternatives 
for patient management (assigned developer versus no assigned developer). The presence of a 
course instructor as moderator was found not to improve learning outcomes significantly. When 
no developer of alternatives was assigned, the two moderator conditions performed 
equivalently. When a developer of alternatives was specified, the student-moderated groups 
performed significantly better than the instructor-moderated groups. 

In contrast, Zhang (2004) found that an externally moderated group scored significantly higher 
on problems calling for use of statistical knowledge and problem-solving skills than a peer- 
controlled group on both well- and ill-structured problems. Zhang’s study compared the 
effectiveness of peer versus instructor moderation of online asynchronous collaboration. 

Students were randomly assigned to one of two groups. One group had a “private” online space 
where students entirely controlled discussion. The other group’s discussion was moderated by 
the instructor, who also engaged with students through personal e-mails and other media. 

Scripts for Online Interaction 

Four Category 3 studies investigated alternatives to human moderation of online discussion in 
the form of “scaffolding” or “scripts” designed to produce more productive online interaction. 
The majority of these studies indicated that scripts to guide interactions among 
groups learning together online did not improve learning outcomes. 

The one study that found positive student outcomes for learners who had been provided scripts 
was conducted by Weinberger et al. (2005). These researchers created two types of scripts: 
“epistemic scripts,” which specified how learners were to approach an assigned task and guided 
learners to particular concepts or aspects of an activity, and “social scripts,” which structured 
how students should interact with each other through methods such as gathering information 
from each other by asking critical questions. They found that social scripts improved 
performance on tests of individual knowledge compared with a control group that participated in 
online discussions without either script (whether or not the epistemic script was provided). 

The remaining three studies that examined the effect of providing scripts or scaffolds for online 
interaction found no significant effect on learning (Choi, Land and Turgeon 2005; Hron et al. 
2000; Ryan 2007). Hron et al. (2000) used an experimental design to compare three groups: (a) a 
control that received no instructions regarding a 1-hour online discussion, (b) a group receiving 
organizing questions to help structure their online communication and (c) a group receiving both 
the organizing questions and rules for discussion. The discussion rules stated that group members 
should discuss only the organizing questions; that discussion of one question had to be 
completed before the next discussion was begun; that the discussion needed to be structured as 
an argument, with claims justified and alternative viewpoints considered; and that all participants 
should take turns moderating the discussion and making sure that the discussion adhered to the 
rules. Hron et al. found statistically significant differences across conditions in the content and 
coherence of student postings, but no difference across the three groups in terms of knowledge 
acquisition as measured by a multiple-choice test. 

Ryan’s study (2007) reached conclusions similar to those of Hron et al. Ryan hypothesized that 
exposure to collaborative tools would affect student performance. He compared two groups of 
middle school students: a treatment group, which engaged in online learning that included 
interaction with instructors and peers using online collaboration tools, and a control group, which 
did not have access to or instruction in the use of collaboration tools. Like Hron et al., Ryan 
found no significant difference in academic performance between the two groups of online 
students. 

Choi, Land and Turgeon (2005) used a time-series control-group design to investigate the effects 
of providing online scaffolding for generating questions to peers during online group 
discussions. Although scaffolds were found to increase the number of questions asked, they did 
not affect question quality or learner outcomes. 

In summary, mechanisms such as scaffolds or scripts for student group interaction online have 
been found to influence the way students engage with each other and with the online material, 
but have not been found to improve learning. 

Delivery Platform 

Several platform options are available for online learning: an exclusively Web-based 
environment, e-mail, or mobile phone. The alternative platforms can be used as primary 
delivery channels or as supplements to Web-based instruction. Neither of the two studies that 
addressed this issue found significant differences across delivery platforms. Shih (2007) 
investigated whether student groups who accessed online materials by means of mobile phone 
demonstrated significantly different learning outcomes from groups who did so using a 
traditional computer; the author found no statistical difference between the two groups. 

Similarly, Kerfoot (2008) compared the effects of receiving course materials and information 
through a series of e-mails spaced out over time versus accessing the online materials all at once 
by means of a traditional Web site and found no statistically significant difference. 

Overall, the controlled studies are too few to support even tentative conclusions concerning the 
learning effects of using alternative or multiple delivery platforms for online learning. 

Summary 



This narrative review has illustrated the many variations in online, individual and group, and 
synchronous and asynchronous activities that can be combined in a course or instructional 
intervention. The number of Category 3 studies concerning any single practice was insufficient 
to warrant a quantitative meta-analysis, and the results varied to such an extent that only 
tentative, rather than firm, conclusions can be drawn about promising online learning practices. 

The direct comparison of blended and purely online conditions in 10 studies produced mostly 
null results, tempering what appeared to be an advantage of blended compared with purely 
online instruction in the moderator variable analysis that was conducted as part of the meta- 
analysis presented in chapter 3. Although a fair number of Category 3 studies contrasted these 
two versions of online learning, few equated instructional content or activities across conditions, 
making it difficult to draw conclusions. 

With respect to incorporation of multiple media, the evidence available in the Category 3 studies 
suggests that inclusion of more media in an online application does not enhance learning when 
content is controlled, but some evidence suggests that the learner’s ability to control the learning 
media is important (Zhang 2005; Zhang et al. 2006). In contrast, the set of studies using various 
manipulations to try to stimulate more active engagement on the part of online learners (such as 
use of advanced organizers, conceptual maps, or guiding questions) had mostly null results. 

The clearest recommendation for practice that can be made on the basis of the Category 3 
synthesis is to incorporate mechanisms that promote student reflection on their level of 
understanding. A dozen studies have investigated what effects manipulations that trigger learner 
reflection and self-monitoring of understanding have on individual students’ online learning 
outcomes. Ten of the studies found that the experimental manipulations offered advantages over 
online learning that did not provide the trigger for reflection. 

Another set of studies explored features usually associated with computer-based instruction, 
including the incorporation of quizzes, simulations, and techniques for individualizing 
instruction. Providing simple multiple-choice quizzes did not appear to enhance online 
learning. The incorporation of simulations produced positive effects in two out of three studies 
(Castaneda 2008; Hibelink 2007). Individualizing online learning by dynamically generating 
learning content based on the student’s responses was found to be effective in the two studies 
investigating this topic (Grant and Courtoreille 2007; Nguyen 2007). 

Attempts to guide the online interactions of groups of learners were less successful than the use 
of mechanisms to prompt reflection and self-assessment on the part of individual learners. Some 
researchers have suggested that students who learn in online groups provide scaffolds for one 
another (Suh 2006). 

Finally, readers should be cautioned that the literature on alternative online learning practices has 
been conducted for the most part by professors and other instructors who are conducting research 
using their own courses. Moreover, the combinations of technology, content and activities used 
in different experimental conditions have often been ad hoc rather than theory based. As a result, 
the field lacks a coherent body of linked studies that systematically test theory-based approaches 
in different contexts. 

5. Discussion and Implications 

The meta-analysis reported here differs from prior meta-analyses of distance learning in several 
important respects: 

• Only studies of Web-supported learning have been included. 

• All effects have been based on objective measures of learning. 

• Only studies with controlled designs that met minimum quality criteria have been 
included. 

The corpus of 51 effect sizes extracted from 46 studies meeting these criteria was sufficient to 
demonstrate that in recent applications, online learning has been modestly more effective, on 
average, than the traditional face-to-face instruction with which it has been compared. 
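
The effect sizes discussed here are standardized mean differences: as the report's abstract puts it, the difference between treatment and control means divided by the pooled standard deviation. A minimal Python sketch of that arithmetic follows. The function names and the sample numbers are illustrative assumptions, not values from any study in the corpus, and the sketch leaves out any small-sample correction the analysts may have applied.

```python
import math


def pooled_sd(sd_t, n_t, sd_c, n_c):
    """Pooled standard deviation of a treatment and a control group."""
    numerator = (n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2
    return math.sqrt(numerator / (n_t + n_c - 2))


def standardized_mean_difference(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """(treatment mean - control mean) / pooled standard deviation."""
    return (mean_t - mean_c) / pooled_sd(sd_t, n_t, sd_c, n_c)


# Hypothetical posttest scores: online (treatment) vs. face-to-face (control).
d = standardized_mean_difference(78.0, 10.0, 30, 75.0, 10.0, 30)
print(round(d, 3))
```

A positive value favors the treatment (online) condition; a negative value favors the control.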

The test for homogeneity of effects found significant variability in the effect sizes for the 
different online learning studies, justifying a search for moderator variables that could explain 
the differences in outcomes. The moderator variable analysis found only three moderators 
significant at p < .05. Effects were larger when a blended rather than a purely online condition 
was compared with face-to-face instruction; when students in the online condition spent more 
time learning than did students in the face-to-face condition; and when the curricular materials 
and instruction varied between the online and face-to-face conditions. This pattern of significant 
moderator variables is consistent with the interpretation that the advantage of online conditions 
in these recent studies stems from aspects of the treatment conditions other than the use of the 
Internet for delivery per se. 
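
A homogeneity test of the kind mentioned above is conventionally computed as Cochran's Q: the inverse-variance-weighted sum of squared deviations of each study's effect from the weighted mean effect, compared against a chi-square distribution with k - 1 degrees of freedom. This section does not spell out the formula the analysts used, so the sketch below assumes that conventional form; the function name and the input values are hypothetical.

```python
def homogeneity_q(effects, variances):
    """Return (Q, degrees of freedom) for a set of study effect sizes.

    Assumes the conventional inverse-variance weighting; a Q well above
    its degrees of freedom suggests more variability among effects than
    sampling error alone would produce.
    """
    weights = [1.0 / v for v in variances]
    weighted_mean = sum(w * d for w, d in zip(weights, effects)) / sum(weights)
    q = sum(w * (d - weighted_mean) ** 2 for w, d in zip(weights, effects))
    return q, len(effects) - 1


# Hypothetical effect sizes and sampling variances for four studies.
q, df = homogeneity_q([0.10, 0.55, -0.05, 0.40], [0.04, 0.03, 0.05, 0.02])
print(f"Q = {q:.2f} on {df} degrees of freedom")
```

When Q is significant, as the report states it was, a search for moderator variables is the usual next step.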

Clark (1983) has cautioned against interpreting studies of instruction in different media as 
demonstrating an effect for a given medium inasmuch as conditions may vary with respect to a 
whole set of instructor and content variables. That caution applies well to the findings of this 
meta-analysis, which should not be construed as demonstrating that online learning is superior as 
a medium. Rather, it is the combination of elements in the treatment conditions, which are likely 
to include additional learning time and materials as well as additional opportunities for 
collaboration, that has proven effective. The meta-analysis findings do not support simply 
putting an existing course online, but they do support redesigning instruction to incorporate 
additional learning opportunities online. 

Several practices and conditions associated with differential effectiveness in distance education 
meta-analyses (most of which included nonlearning outcomes such as satisfaction) were not 
found to be significant moderators of effects in this meta-analysis of Web-based online learning. 
Nor did tests for the incorporation of instructional elements of computer-based instruction (e.g., 
online practice opportunities and feedback to learners) find that these variables made a 
difference. Online learning conditions produced better outcomes than face-to-face learning alone, 
regardless of whether these instructional practices were used. 

The meta-analysis did not find differences in average effect size between studies published 
before 2004 (which might have used less sophisticated Web-based technologies than those 
available since) and studies published from 2004 on (possibly reflecting the more sophisticated 
graphics and animations or more complex instructional designs available). Nor were differences 
associated with the nature of the subject matter involved. 

Finally, the examination of the influence of study method variables found that effect sizes did not 
vary significantly with study sample size or with type of design. It is reassuring to note that, on 
average, online learning produced better student learning outcomes than face-to-face instruction 
in those studies with random-assignment experimental designs (p < .001) and in those studies 
with the largest sample sizes (p < .001). 

The relatively small number of studies meeting criteria for inclusion in this meta-analysis limits 
the power of tests for moderator variables. A few contrasts that did not attain significance (e.g., 
learning experience or type of knowledge tested) might have emerged as significant influences 
under a fixed-effects analysis and may prove significant when tested in future meta-analyses 
with a larger corpus of studies. 
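
The point about fixed-effects analysis can be illustrated with a minimal sketch. It is not the computation performed for this report (which used Comprehensive Meta-Analysis software); the effect sizes and variances are hypothetical, and the random-effects estimator shown (DerSimonian-Laird) is an assumed, commonly used choice. A random-effects model adds between-study variance to every study's sampling variance, so its standard error is never smaller than the fixed-effect one, and a contrast is less likely to reach significance.

```python
import math


def fixed_effect(effects, variances):
    """Inverse-variance weighted mean effect and its standard error."""
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    mean = sum(w * d for w, d in zip(weights, effects)) / total
    return mean, math.sqrt(1.0 / total)


def random_effects(effects, variances):
    """DerSimonian-Laird random-effects mean effect and its standard error."""
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    fe_mean, _ = fixed_effect(effects, variances)
    # Q statistic: weighted squared deviations from the fixed-effect mean
    q = sum(w * (d - fe_mean) ** 2 for w, d in zip(weights, effects))
    c = total - sum(w * w for w in weights) / total
    # Between-study variance estimate, truncated at zero
    tau2 = max(0.0, (q - (len(effects) - 1)) / c)
    re_weights = [1.0 / (v + tau2) for v in variances]
    re_total = sum(re_weights)
    mean = sum(w * d for w, d in zip(re_weights, effects)) / re_total
    return mean, math.sqrt(1.0 / re_total)


# Hypothetical effect sizes and sampling variances (not data from this report)
effects = [0.10, 0.50, -0.05, 0.60]
variances = [0.02, 0.03, 0.02, 0.04]

fe_mean, fe_se = fixed_effect(effects, variances)
re_mean, re_se = random_effects(effects, variances)
print(f"fixed effect:   mean={fe_mean:.3f}, SE={fe_se:.3f}")
print(f"random effects: mean={re_mean:.3f}, SE={re_se:.3f}")
```

With heterogeneous studies such as these, the random-effects standard error is the larger of the two, which is why a moderator contrast can be significant under one model and not the other.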

The narrative synthesis of studies comparing variations of online learning provides some 
additional insights with respect to designing effective online learning experiences. The practice 
with the strongest evidence of effectiveness is inclusion of mechanisms to prompt students to 
reflect on their level of understanding as they are learning online. In a related vein, there is some 
evidence that online learning environments with the capacity to individualize instruction to a 
learner’s specific needs improve effectiveness. 

As noted in chapter 4, the results of studies using purely online and blended conditions cast some 
doubt on the meta-analysis finding of larger effect sizes for studies blending online and face-to- 
face elements. The inconsistency in the implications of the two sets of studies underscores the 
importance of recognizing the confounding of practice variables in most studies. Studies using 
blended learning also tend to involve more learning time, additional instructional resources, and 
course elements that encourage interactions among learners. This confounding leaves open the 
possibility that one or all of these other practice variables, rather than the blending of online and 
offline media per se, accounts for the particularly positive outcomes for blended learning in the 
studies included in the meta-analysis. 

Comparison With Meta-Analyses of Distance Learning 

Because online learning has much in common with distance learning, it is useful to compare the 
findings of the present meta-analysis with the most comprehensive recent meta-analyses in the 
distance-learning field. The two most pertinent earlier works are those by Bernard et al. (2004) 
and Zhao et al. (2005). As noted above, the corpus in this meta-analysis differed from the earlier 
quantitative syntheses, not only in including more recent studies but also in excluding studies 
that did not involve Web-based instruction and studies that did not examine an objective student 
learning outcome. 

Bernard et al. (2004) found advantages for asynchronous over synchronous distance education, a 
finding that on the surface appears incongruent with the results reported here. On closer 
inspection, however, it turns out that the synchronous distance-education studies in the Bernard 
et al. corpus were mostly cases of a satellite classroom yoked to the main classroom where the 
instructor taught. It is likely that the nature of the learning experience and extent of collaborative 
learning were quite different in the primary and distant classrooms in these studies. For 
asynchronous distance education, Bernard et al. also found that the distance-education condition 
tended to have more favorable outcomes when opportunities for computer-mediated 
communication were available. Online learners in all of the studies in this meta-analysis had 
access to computer-mediated communication and in every case there were mechanisms for 
asynchronous communication. 

Zhao et al. (2005) found advantages for blended learning (combining elements of online and 
face-to-face communication) over purely online learning experiences, a finding similar to that of 
this meta-analysis. Zhao et al. also found that instructor involvement was a strong mediating 
variable. Distance learning outcomes were less positive when instructor involvement was low (as 
in “canned” applications), with effects becoming more positive, up to a point, as instructor 
involvement increased. At the highest level of instructor involvement (which would suggest that 
the instructor became dominant and peer-to-peer learning was minimized), effect size started to 
decline in the corpus of studies Zhao et al. examined. Although a somewhat different construct 
was tested in the Learning Experience variable used here, the present results are consonant with 
those of Zhao et al. Studies in which the online learners worked with digital resources with little 
or no teacher guidance were coded here as “independent/active,” and this category was the one 
learner experience category for which the advantage of online learning failed to attain statistical 
significance at the p < .05 level or better. 

The relative disadvantage of independent online learning (called “active” in our conceptual 
model) should not be confused with automated mechanisms that encourage students to be more 
reflective or more actively engaged with the material they are learning online. As noted above, a 
number of studies reviewed in chapter 4 found positive effects for techniques such as prompts 
that encourage students to assess their level of understanding or set goals for what they will learn, 
whereas mechanisms such as guiding questions or advance organizers had mostly null results. 

Implications for K-12 Education 

The impetus for this meta-analysis of recent empirical studies of online learning was the need to 
develop research-based insights into online learning practices for K-12 students. The research 
team realized at the outset that a look at online learning studies in a broader set of fields would 
be necessary to assemble sufficient empirical research for meta-analysis. As it happened, the 
initial search of the literature published between 1996 and 2006 found no studies contrasting K-12 
online learning with face-to-face instruction that met methodological quality criteria.* By 
performing a second literature search with an expanded time frame (through July 2008), the team 
was able to greatly expand the corpus of studies with controlled designs and to identify five 
controlled studies of K-12 online learning with seven contrasts between online and face-to-face 
conditions. This expanded corpus still comprises a very small number of studies, especially 
considering the extent to which secondary schools are using online courses and the rapid growth 
of online instruction in K-12 education as a whole. Educators making decisions about online 
learning need rigorous research examining the effectiveness of online learning for different types 
of students and subject matter as well as studies of the relative effectiveness of different online 
learning practices. 

* The initial literature search identified several K-12 online studies comparing student achievement data collected 
from both virtual and regular schools (e.g., Cavanaugh et al. 2004; Schollie 2001), but these studies were neither 
experiments nor quasi-experiments with statistical control for preexisting differences between groups. Some of 
these K-12 studies used a pre-post, within-subject design without a comparison group; others were 
quasi-experiments without a statistical control for preexisting differences among study conditions (e.g., Karp and 
Woods 2003; Long and Stevens 2004; Stevens 1999). Several studies used experimental designs with K-12 
students but did not report the data needed to compute or estimate effect sizes. A few experiments compared a 
K-12 online intervention with a condition in which there was no instruction (e.g., Teague and Riley 2006). Many of 
the references (8 out of 14) used for the Cavanaugh et al. (2004) meta-analysis of K-12 online studies were 
databases of raw student performance data and did not describe learning conditions, technology use or 
learner/instructor characteristics. A recent large-scale study by the Florida TaxWatch (2007) failed to control for 
preexisting differences between the students taking courses online and those taking them in conventional 
classrooms. 
Appendix 

Meta-Analysis Methodology 

Terms and Processes Used in the Database Searches 

In March 2007, researchers performed searches through the following four data sources: 

1. Electronic research databases. Using a common set of keywords (see Exhibit A-1), 
searches were performed in ERIC, PsycINFO, PubMed, ABI/INFORM, and UMI 
ProQuest Digital Dissertations. In addition, to make sure that studies of online 
learning in teacher professional development and career technical education were 
included, additional sets of keywords, shown in Exhibit A-2, were used in additional 
searches of ERIC and PsycINFO. 

2. Recent meta-analyses and narrative syntheses. Researchers reviewed the lists of 
studies included in Bernard et al. (2004), Cavanaugh et al. (2004), Childs (2001), 
Sitzmann et al. (2006), Tallent-Runnels et al. (2006), Wisher and Olson (2003), and 
Zhao et al. (2005) for possible inclusions. Additionally, for teacher professional 
development and career technical education, references from recent narrative research 
syntheses in those fields (Whitehouse et al. 2006; Zirkle 2003) were examined to 
identify potential studies for inclusion. 

3. Key journals. Abstracts were manually reviewed for articles published since 2005 in 
American Journal of Distance Education, Journal of Distance Education (Canada), 
Distance Education (Australia), International Review of Research in Distance and 
Open Education, and Journal of Asynchronous Learning Networks. In addition, the 
Journal of Technology and Teacher Education and Career and Technical Education 
Research (formerly known as Journal of Vocational Education Research) were 
manually searched. 

4. Google Scholar searches. To complement these targeted searches, researchers used 
limiting parameters and sets of keywords (available from the authors of this report) in 
the Google Scholar search engine. 







Exhibit A-1. Terms for Initial Research Database Search 

Technology and Education/Training Terms    Study Design Terms (a) 
Distance education                         Control group 
Distance learning                          Comparison group 
E-learning                                 Treatment group 
Online education                           Experimental 
Online learning 
Online training 
Online course 
Virtual learning 
Virtual training 
Virtual & course 
Internet & learning 
Internet & training 
Internet & course 
Web-based learning 
Web-based instruction 
Web-based course 
Web-based training 
“Distributed learning” 

(a) All four terms were used in one query with “OR” if the database allowed. 
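
As an illustration of how the terms in Exhibit A-1 might be combined, the following sketch builds one query per technology term. The exhibit note states only that the four study design terms were joined with “OR”; pairing each technology term with that clause through “AND” is an assumption, as are the quoting style and the helper name `build_query`. Actual syntax differs from one database to another.

```python
# A subset of the technology and education/training terms from Exhibit A-1
technology_terms = ["distance education", "online learning", "web-based course"]

# The four study design terms from Exhibit A-1
design_terms = ["control group", "comparison group", "treatment group", "experimental"]


def build_query(technology_term, design_terms):
    """Hypothetical helper: one technology term AND any of the design terms."""
    design_clause = " OR ".join(f'"{term}"' for term in design_terms)
    return f'"{technology_term}" AND ({design_clause})'


queries = [build_query(term, design_terms) for term in technology_terms]
for query in queries:
    print(query)
```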



Exhibit A-2. Terms for Additional Database Searches for Online Career Technical Education and 
Teacher Professional Development 

Education Terms                     Technology Terms    Study Design Terms 
Career education                    Distance            Control group 
Vocational education                Distributed         Comparison group 
Teacher education                   E-learning          Experimental 
Teacher mentoring                   Internet            Randomized 
Teacher professional development    Online              Treatment group 
Teacher training                    Virtual 
Technical education                 Web-based 
Additional Sources of Articles 



Exhibit A-3 lists the sources for the resulting 502 articles that went through full-text screening. 



Exhibit A-3. Sources for Articles in the Full-Text Screening 

                                                      Number of Articles Identified 
                                                      and Passing Initial Screening 
Total retained for full-text screen                   502 
Source of articles in full-text screen: 
  Electronic research database searches               316 
  Additional database searches for teacher 
    professional development and career 
    technical education                               6 
  Recent meta-analyses                                171 
  Manual review of key journals                       19 
  Google Scholar searches                             31 
  Recommendations from experts                        3 
  Overlaps                                            -36 
  Unretrievable                                       -8 
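
The counts in Exhibit A-3 can be reconciled with a short arithmetic check. The figures below are copied from the exhibit; the variable names are illustrative only.

```python
# Articles identified per source, as listed in Exhibit A-3
sources = {
    "Electronic research database searches": 316,
    "Additional database searches (teacher PD, career technical education)": 6,
    "Recent meta-analyses": 171,
    "Manual review of key journals": 19,
    "Google Scholar searches": 31,
    "Recommendations from experts": 3,
}

# Deductions listed in Exhibit A-3
adjustments = {"Overlaps": -36, "Unretrievable": -8}

identified = sum(sources.values())
retained = identified + sum(adjustments.values())
print(identified, retained)  # 546 identified, 502 retained for full-text screening
```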
Effect Size Extraction 



Of the 176 studies passing the full-text screening, 99 were identified as having at least one 
contrast between online learning and face-to-face or offline learning (Category 1) or between 
blended learning and face-to-face/offline learning (Category 2). These studies were transferred to 
quantitative analysts for effect size extraction. 

Numerical and statistical data contained in the studies were extracted for analysis with 
Comprehensive Meta-Analysis software (Biostat Solutions 2006). Data provided in the form of 
t-tests, F-tests, correlations, p-levels, and frequencies were used for this purpose. 

During the data extraction phase, it became apparent that one set of studies rarely provided 
sufficient data for Comprehensive Meta-Analysis calculation of an effect size. 
Quasi-experimental studies that used hierarchical linear modeling or analysis of covariance with 
adjustment for pretests and other learner characteristics through covariates typically did not 
report some of the data elements needed to compute an effect size. For studies using hierarchical 
linear modeling to analyze effects, typically the regression coefficient on the treatment status 
variable (treatment or control), its standard error, and a p-value and sample sizes for the two 
groups were reported. For analyses of covariance, typically the adjusted means and F-statistic 
were reported along with group sample sizes. In almost all cases, the unadjusted standard 
deviations for the two groups were not reported and could not be computed because the pretest- 
posttest correlation was not provided. Following the advice of Robert Bernard, the chief meta- 
analysis expert on the project’s Technical Working Group, analysts decided to retain these 
studies and to use a conservative estimate of the pretest-posttest correlation (r = .70) in 
estimating an effect size for those studies where the pretest was the same measure as the posttest 
and a pretest-posttest correlation of r = .50 where it was not. These effect sizes were 
flagged in the coding as “estimated effect sizes,” as were effect sizes computed from t tests, F 
tests, and p levels. 
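
The role of the assumed pretest-posttest correlation can be sketched as follows. The report does not state the conversion formulas applied inside Comprehensive Meta-Analysis, so the functions below use commonly cited conversions as an assumption: a t statistic is rescaled by the group sizes, and a two-group, one-covariate ANCOVA F statistic is additionally multiplied by sqrt(1 - r^2) to express the difference in unadjusted standard deviation units. Function names and input values are hypothetical.

```python
import math


def d_from_t(t_stat, n_treatment, n_control):
    """Standardized mean difference from an independent-samples t statistic."""
    return t_stat * math.sqrt(1.0 / n_treatment + 1.0 / n_control)


def d_from_ancova_f(f_stat, n_treatment, n_control, r, treatment_higher=True):
    """Standardized mean difference from a two-group, one-covariate ANCOVA F.

    The F statistic is scaled by the covariate-adjusted error variance, so the
    result is multiplied by sqrt(1 - r**2), where r is the assumed
    pretest-posttest correlation, to return to unadjusted SD units.
    """
    d_adjusted = math.sqrt(f_stat * (1.0 / n_treatment + 1.0 / n_control))
    d = d_adjusted * math.sqrt(1.0 - r ** 2)
    return d if treatment_higher else -d


# Hypothetical study: F = 9.0 with 50 learners per condition
d_same_measure = d_from_ancova_f(9.0, 50, 50, r=0.70)   # pretest same as posttest
d_other_measure = d_from_ancova_f(9.0, 50, 50, r=0.50)  # pretest differs
print(round(d_same_measure, 3), round(d_other_measure, 3))
```

Under this conversion, a larger assumed correlation yields a smaller effect size estimate, which is consistent with describing r = .70 as the conservative choice.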

In extracting effect size data, the analysts followed a set of rules: 

• The unit of analysis was the independent contrast between online condition and face-to- 
face condition (Category 1) or between blended condition and face-to-face condition 
(Category 2). Some studies reported more than one contrast, either by reporting more 
than one experiment or by having multiple treatment conditions (e.g., online vs. blended 
vs. face-to-face) in a single experiment. 

• When there were multiple treatment groups or multiple control groups and the nature of 
the instruction in the groups did not differ considerably (e.g., two treatment groups both 
fell into the “blended” instruction category), then the weighted mean of the groups and 
pooled standard deviation were used. 

• When there were multiple treatment groups or multiple control groups and the nature of 
the instruction in the groups did differ considerably (e.g., one treatment was purely online 
whereas the other treatment was blended instruction, both compared against the face-to- 
face condition), then analysts treated them as independent contrasts. 
• In general, one learning outcome finding was extracted from each study. When multiple 
learning outcome data were reported (e.g., assignments, midterm and final examinations, 
grade point averages, grade distributions), the outcome that could be expected to be more 
stable and more closely aligned to the instruction was extracted (e.g., final examination 
scores instead of quizzes). However, in some studies, no learning outcome had obvious 
superiority over the others. In such cases, analysts extracted multiple contrasts from the 
study and calculated the weighted average of the multiple outcome scores if the outcome 
measures were similar (e.g., two final tests, one testing procedural skills and the other 
testing declarative knowledge). In one study, by contrast, analysts retained two outcome 
findings because the outcome measures were quite different (Schilling et al. 2006). One 
measure was a multiple-choice test, examining basic knowledge, whereas the other was a 
performance-based assessment, testing students’ strategic and problem-solving skills in 
the context of ill-structured problems. 

• Learning outcome findings were extracted at the individual level. Analysts did not extract 
group-level learning outcomes (e.g., scores for a group product). Too few group products 
were included in the studies to support analyses of this variable. 
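
The rule for combining similar groups can be sketched as follows, assuming each group is summarized by its sample size, mean, and standard deviation. The group statistics are hypothetical and the function names are illustrative; the effect size is computed as the difference between treatment and control means divided by the pooled standard deviation.

```python
import math


def combine_groups(groups):
    """Combine similar groups given as (n, mean, sd) tuples.

    Returns the total n, the sample-size-weighted mean, and the pooled
    (within-group) standard deviation.
    """
    total_n = sum(n for n, _, _ in groups)
    weighted_mean = sum(n * mean for n, mean, _ in groups) / total_n
    pooled_variance = sum((n - 1) * sd ** 2 for n, _, sd in groups) / (
        total_n - len(groups)
    )
    return total_n, weighted_mean, math.sqrt(pooled_variance)


def standardized_mean_difference(treatment, control):
    """(Treatment mean - control mean) divided by the pooled standard deviation."""
    n_t, mean_t, sd_t = treatment
    n_c, mean_c, sd_c = control
    pooled_sd = math.sqrt(
        ((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2) / (n_t + n_c - 2)
    )
    return (mean_t - mean_c) / pooled_sd


# Hypothetical study with two blended groups and one face-to-face group
blended_a = (30, 78.0, 10.0)
blended_b = (20, 83.0, 10.0)
face_to_face = (50, 75.0, 10.0)

treatment = combine_groups([blended_a, blended_b])
effect_size = standardized_mean_difference(treatment, face_to_face)
print(treatment, effect_size)
```

When the treatment groups differ considerably in kind, each would instead be passed separately to `standardized_mean_difference`, giving one contrast per treatment.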

The review of the 99 studies for effect size calculation produced 51 independent effect sizes (28 
for Category 1 and 23 for Category 2) from 46 studies; 53 studies did not report sufficient data to 
support effect-size calculation. 

Coding of Study Features 

All studies that provided enough effect size data were coded for their study features and for study 
quality. The top-level coding structure, incorporating refinements made after pilot testing, is 
shown in Exhibit A-4. (The full coding structure is available from the authors of this report.) 

Twenty percent of the studies with sufficient data to compute effect size were coded by two 
researchers. The interrater reliability across these double-coded studies was 86.4 percent. As a 
result of analyzing coder disagreements, some definitions and decision rules for some codes were 
refined; other codes that required information missing in the vast majority of documents or that 
proved difficult to code reliably (e.g., indication of whether the instructor was certified or not) 
were eliminated. A single researcher coded the remaining studies. 
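
The report gives interrater reliability as a percentage but does not name the statistic. A minimal sketch, assuming simple percent agreement across all double-coded items, is shown below; the code values are hypothetical and do not reproduce the 86.4 percent figure.

```python
def percent_agreement(codes_a, codes_b):
    """Percentage of items on which two coders assigned the same code."""
    if len(codes_a) != len(codes_b):
        raise ValueError("Both coders must code the same number of items")
    if not codes_a:
        raise ValueError("At least one coded item is required")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return 100.0 * matches / len(codes_a)


# Hypothetical codes assigned by two researchers to the same eight items
coder_1 = ["experiment", "adult", "blended", "yes", "no", "pretest", "yes", "medical"]
coder_2 = ["experiment", "adult", "online", "yes", "no", "pretest", "yes", "medical"]

agreement = percent_agreement(coder_1, coder_2)
print(agreement)
```

Percent agreement does not correct for chance agreement; a chance-corrected statistic would give a lower figure for the same codes.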
Exhibit A-4. Top-level Coding Structure for the Meta-analysis 

Study Feature Coding Categories 

• Study type 

• Type of publication 

• Year of publication 

• Study author 

• Whether the instructor was trained in online training 

• Learner type 

• Learner age 

• Learner incentive for involvement in the study 

• Learning setting 

• Subject matter 

• Treatment duration 

• Dominant approach to learner control 

• Media features 

• Opportunity for face-to-face contact with the instructor 

• Opportunity for face-to-face contact with peers 

• Opportunity for asynchronous computer-mediated communication with the instructor 

• Opportunity for asynchronous computer-mediated communication with peers 

• Opportunity for synchronous computer-mediated communication with the instructor 

• Opportunity for synchronous computer-mediated communication with peers 

• Use of problem-based or project-based learning 

• Opportunity for practice 

• Opportunity for feedback 

• Type of media-supported pedagogy 

• Nature of outcome measure 

• Nature of knowledge assessed 

Study Design Codes 

• Unit of assignment to conditions 

• Sample size for unit of assignment 

• Student equivalence 

• Whether equivalence of groups at preintervention was described 

• Equivalence of prior knowledge/pretest scores 

• Instructor equivalence 

• Time-on-task equivalence 

• Curriculum material/instruction equivalence 

• Attrition equivalence 

• Contamination 